Glossary of Terms for Paper Reading

绝对值整流 absolute value rectification

准确率 accuracy

声学 acoustic

激活函数 activation function

AdaGrad AdaGrad

对抗 adversarial

对抗样本 adversarial example

对抗训练 adversarial training

几乎处处 almost everywhere

几乎必然 almost sure

几乎必然收敛 almost sure convergence

选择性剪接数据集 alternative splicing dataset

原始采样 Ancestral Sampling

退火重要采样 annealed importance sampling

专用集成电路 application-specific integrated circuit

近似贝叶斯计算 approximate Bayesian computation

近似推断 approximate inference

架构 architecture

人工智能 artificial intelligence

人工神经网络 artificial neural network

渐近无偏 asymptotically unbiased

异步随机梯度下降 Asynchronous Stochastic Gradient Descent

异步 asynchronous

注意力机制 attention mechanism

属性 attribute

自编码器 autoencoder

自动微分 automatic differentiation

自动语音识别 Automatic Speech Recognition

自回归网络 auto-regressive network

反向传播 back propagate

反向传播 back propagation

回退 back-off

反向传播 backprop

通过时间反向传播 back-propagation through time

反向传播 backward propagation

词袋 bag of words

Bagging bootstrap aggregating

bandit bandit

批量 batch

批标准化 batch normalization

贝叶斯误差 Bayes error

贝叶斯规则 Bayes’ rule

贝叶斯推断 Bayesian inference

贝叶斯网络 Bayesian network

贝叶斯概率 Bayesian probability

贝叶斯统计 Bayesian statistics

基准 benchmark

信念网络 belief network

Bernoulli 分布 Bernoulli distribution

基准 baseline

BFGS BFGS

偏置 bias in affine function

偏差 bias in statistics

有偏 biased

有偏重要采样 biased importance sampling

偏差 bias

二元语法 bigram

二元关系 binary relation

二值稀疏编码 binary sparse coding

比特 bit

块坐标下降 block coordinate descent

块吉布斯采样 block Gibbs Sampling

玻尔兹曼分布 Boltzmann distribution

玻尔兹曼机 Boltzmann Machine

Boosting Boosting

桥式采样 bridge sampling

广播 broadcasting

磨合 burn-in

变分法 calculus of variations

容量 capacity

级联 cascade

灾难遗忘 catastrophic forgetting

范畴分布 categorical distribution

因果因子 causal factor

因果模型 causal modeling

中心差分 centered difference

中心极限定理 central limit theorem

链式法则 chain rule

混沌 chaos

弦 chord

弦图 chordal graph

梯度截断 clip gradient

截断梯度 clipping the gradient

团 clique

团势能 clique potential

闭式解 closed form solution

级联 coalesced

编码 code

协同过滤 collaborative filtering

列 column

列空间 column space

共因 common cause

完全图 complete graph

复杂细胞 complex cell

计算图 computational graph

计算机视觉 Computer Vision

概念漂移 concept drift

条件计算 conditional computation

条件概率 conditional probability

条件独立的 conditionally independent

共轭 conjugate

共轭方向 conjugate directions

共轭梯度 conjugate gradient

联结主义 connectionism

一致性 consistency

约束优化 constrained optimization

特定环境下的独立 context-specific independence

contextual bandit contextual bandit

延拓法 continuation method

收缩 contractive

收缩自编码器 contractive autoencoder

对比散度 contrastive divergence

凸优化 Convex optimization

卷积 convolution

卷积玻尔兹曼机 Convolutional Boltzmann Machine

卷积玻尔兹曼机 convolutional Boltzmann machine

卷积网络 convolutional net

卷积网络 convolutional network

卷积神经网络 convolutional neural network

坐标上升 coordinate ascent

坐标下降 coordinate descent

共父 coparent

相关系数 correlation

代价 cost

代价函数 cost function

协方差 covariance

协方差矩阵 covariance matrix

协方差 RBM covariance RBM

覆盖 coverage

准则 criterion

临界点 critical point

临界温度 critical temperatures

互相关函数 cross-correlation

交叉熵 cross-entropy

累积函数 cumulative function

课程学习 curriculum learning

维数灾难 curse of dimensionality

控制论 cybernetics

衰减 damping

数据生成分布 data generating distribution

数据生成过程 data generating process

数据并行 data parallelism

数据点 data point

数据集 dataset

数据集增强 dataset augmentation

决策树 decision tree

解码器 decoder

分解 decompose

深度信念网络 deep belief network

深度玻尔兹曼机 Deep Boltzmann Machine

深度回路 deep circuit

深度前馈网络 deep feedforward network

深度生成模型 deep generative model

深度学习 deep learning

深度模型 deep model

深度网络 deep network

信任度 degree of belief

去噪 denoising

去噪自编码器 denoising autoencoder

去噪得分匹配 denoising score matching

依赖 dependency

深度 depth

导数 derivative

描述 description

设计矩阵 design matrix

细致平衡 detailed balance

探测级 detector stage

确定性 deterministic

对角矩阵 diagonal matrix

微分熵 differential entropy

微分方程 differential equation

降维 dimensionality reduction

Dirac delta 函数 Dirac delta function

Dirac 分布 Dirac distribution

有向 directed

有向图模型 directed graphical model

有向模型 Directed Model

方向导数 directional derivative

判别 RBM discriminative RBM

判别器网络 discriminator network

分布式表示 distributed representation

深度神经网络 DNN

领域自适应 domain adaptation

点积 dot product

双反向传播 double backprop

双重分块循环矩阵 doubly block circulant matrix

降采样 downsampling

Dropout Dropout

Dropout Boosting Dropout Boosting

d-分离 d-separation

动态规划 dynamic programming

动态结构 dynamic structure

提前终止 early stopping

回声状态网络 echo state network

有效容量 effective capacity

特征分解 eigendecomposition

特征值 eigenvalue

特征向量 eigenvector

基本单位向量 elementary basis vectors

元素对应乘积 element-wise product

嵌入 embedding

经验分布 empirical distribution

经验频率 empirical frequency

经验风险 empirical risk

经验风险最小化 empirical risk minimization

编码器 encoder

端到端的 end-to-end

能量函数 energy function

基于能量的模型 Energy-based model

集成 ensemble

集成学习 ensemble learning

epoch

轮数 epochs

等式约束 equality constraint

均衡分布 Equilibrium Distribution

等变 equivariance

等变表示 equivariant representations

误差条 error bar

误差函数 error function

误差度量 error metric

错误率 error rate

估计量 estimator

欧几里得范数 Euclidean norm

欧拉-拉格朗日方程 Euler-Lagrange Equation

证据下界 evidence lower bound

样本 example

额外误差 excess error

期望 expectation

期望最大化 expectation maximization

E 步 expectation step

期望值 expected value

经验 experience

专家网络 expert network

相消解释 explaining away

相消解释作用 explaining away effect

解释因子 explanatory factor

梯度爆炸 exploding gradient

利用 exploitation

探索 exploration

指数分布 exponential distribution

因子 factor

因子分析 factor analysis

因子图 factor graph

因子 factorial

分解 factorization

分解的 factorized

变差因素 factors of variation

快速 Dropout fast dropout

快速持续性对比散度 fast persistent contrastive divergence

可行 feasible

特征 feature

特征提取器 feature extractor

特征映射 feature map

特征选择 feature selection

反馈 feedback

前向 feedforward

前馈分类器 feedforward classifier

前馈网络 feedforward network

前馈神经网络 feedforward neural network

现场可编程门阵列 field programmable gate array

精调 fine-tune

精调 fine-tuning

有限差分 finite difference

第一层 first layer

不动点方程 fixed point equation

定点运算 fixed-point arithmetic

翻转 flip

浮点运算 floating-point arithmetic

遗忘门 forget gate

前向模式累加 forward mode accumulation

前向传播 forward propagation

傅立叶变换 Fourier transform

中央凹 fovea

自由能 free energy

频率派概率 frequentist probability

频率派统计 frequentist statistics

Frobenius 范数 Frobenius norm

F 分数 F-score

全 full

泛函 functional

泛函导数 functional derivative

Gabor 函数 Gabor function

Gamma 分布 Gamma distribution

门控 gated

门控循环网络 gated recurrent net

门控循环单元 gated recurrent unit

门控 RNN gated RNN

选通器 gater

高斯分布 Gaussian distribution

高斯核 Gaussian kernel

高斯混合模型 Gaussian Mixture Model

高斯混合体 Gaussian mixtures

高斯输出分布 Gaussian output distribution

高斯 RBM Gaussian RBM

Gaussian-Bernoulli RBM

通用 GPU general purpose GPU

泛化 generalization

泛化误差 generalization error

泛化 generalize

广义函数 generalized function

广义 Lagrange 函数 generalized Lagrange function

广义 Lagrangian generalized Lagrangian

广义伪似然 generalized pseudolikelihood

广义伪似然估计 generalized pseudolikelihood estimator

广义得分匹配 generalized score matching

生成式对抗框架 generative adversarial framework

生成式对抗网络 generative adversarial network

生成模型 generative model

生成式建模 generative modeling

生成矩匹配网络 generative moment matching network

生成随机网络 generative stochastic network

生成器网络 generator network

吉布斯分布 Gibbs distribution

Gibbs 采样 Gibbs Sampling

吉布斯步数 Gibbs steps

全局对比度归一化 Global contrast normalization

全局极小值 global minima

全局最小点 global minimum

梯度 gradient

梯度上升 gradient ascent

梯度截断 gradient clipping

梯度下降 gradient descent

图模型 graphical model

图形处理器 Graphics Processing Unit

贪心 greedy

贪心算法 greedy algorithm

贪心逐层预训练 greedy layer-wise pretraining

贪心逐层训练 greedy layer-wise training

贪心逐层无监督预训练 greedy layer-wise unsupervised pretraining

贪心监督预训练 greedy supervised pretraining

贪心无监督预训练 greedy unsupervised pretraining

网格搜索 grid search

Hadamard 乘积 Hadamard product

汉明距离 Hamming distance

硬专家混合体 hard mixture of experts

硬双曲正切函数 hard tanh

簧风琴 harmonium

哈里斯链 Harris Chain

Helmholtz 机 Helmholtz machine

Hessian Hessian

异方差 heteroscedastic

隐藏层 hidden layer

隐马尔可夫模型 Hidden Markov Model

隐藏单元 hidden unit

隐藏变量 hidden variable

爬山 hill climbing

超参数 hyperparameter

超参数优化 hyperparameter optimization

假设空间 hypothesis space

同分布的 identically distributed

可辨认的 identifiable

单位矩阵 identity matrix

独立同分布假设 i.i.d. assumption

病态 ill conditioning

不道德 immorality

重要采样 Importance Sampling

相互独立的 independent

独立成分分析 independent component analysis

独立同分布 independent identically distributed

独立子空间分析 independent subspace analysis

索引 index of matrix

指示函数 indicator function

不等式约束 inequality constraint

推断 inference

无限 infinite

信息检索 information retrieval

内积 inner product

输入 input

输入分布 input distribution

干预查询 intervention query

不变 invariant

求逆 invert

Isomap Isomap

各向同性 isotropic

Jacobian Jacobian

Jacobian 矩阵 Jacobian matrix

联合概率分布 joint probability distribution

Karush–Kuhn–Tucker Karush–Kuhn–Tucker

核函数 kernel function

核机器 kernel machine

核方法 kernel method

核技巧 kernel trick

KL 散度 KL divergence

知识库 knowledge base

知识图谱 knowledge graph

Krylov 方法 Krylov method

KL 散度 Kullback-Leibler (KL) divergence

标签 label

标注 labeled

拉格朗日乘子 Lagrange multiplier

语言模型 language model

Laplace 分布 Laplace distribution

大学习步骤 large learning step

潜在 latent

潜层 latent layer

潜变量 latent variable

大数定理 law of large numbers

逐层的 layer-wise

渗漏整流线性单元 Leaky ReLU

渗漏单元 leaky unit

学成 learned

学习近似推断 learned approximate inference

学习器 learner

学习率 learning rate

勒贝格可积 Lebesgue-integrable

左特征向量 left eigenvector

左奇异向量 left singular vector

莱布尼兹法则 Leibniz’s rule

似然 likelihood

线搜索 line search

线性自回归网络 linear auto-regressive network

线性分类器 linear classifier

线性组合 linear combination

线性相关 linear dependence

线性因子模型 linear factor model

线性模型 linear model

线性回归 linear regression

线性阈值单元 linear threshold units

线性无关 linearly independent

链接预测 link prediction

链接重要采样 linked importance sampling

Lipschitz Lipschitz

Lipschitz 常数 Lipschitz constant

Lipschitz 连续 Lipschitz continuous

流体状态机 liquid state machine

局部条件概率分布 local conditional probability distribution

局部不变性先验 local constancy prior

局部对比度归一化 local contrast normalization

局部下降 local descent

局部核 local kernel

局部极大值 local maxima

局部极大点 local maximum

局部极小值 local minima

局部极小点 local minimum

对数尺度 logarithmic scale

逻辑回归 logistic regression

logistic sigmoid logistic sigmoid

分对数 logit

对数线性模型 log-linear model

长短期记忆 long short-term memory

长期依赖 long-term dependency

环 loop

环状信念传播 loopy belief propagation

损失 loss

损失函数 loss function

机器学习 machine learning

机器学习模型 machine learning model

机器翻译 machine translation

主对角线 main diagonal

流形 manifold

流形假设 manifold hypothesis

流形学习 manifold learning

边缘概率分布 marginal probability distribution

马尔可夫链 Markov Chain

马尔可夫链蒙特卡罗 Markov Chain Monte Carlo

马尔可夫网络 Markov network

马尔可夫随机场 Markov random field

掩码 mask

矩阵 matrix

矩阵逆 matrix inversion

矩阵乘积 matrix product

最大范数 max norm

池 pool

最大池化 max pooling

极大值 maxima

M 步 maximization step

最大后验 Maximum A Posteriori

最大似然 maximum likelihood

最大似然估计 maximum likelihood estimation

最大平均偏差 maximum mean discrepancy

maxout maxout

maxout 单元 maxout unit

平均绝对误差 mean absolute error

均值和协方差 RBM mean and covariance RBM

学生 t 分布均值乘积 mean product of Student t-distribution

均方误差 mean squared error

均值-协方差 RBM mean-covariance restricted Boltzmann machine

均匀场 meanfield

均值场 mean-field

测度论 measure theory

零测度 measure zero

记忆网络 memory network

信息传输 message passing

小批量 minibatch

小批量随机 minibatch stochastic

极小值 minima

极小点 minimum

混合 Mixing

混合时间 Mixing Time

混合密度网络 mixture density network

混合分布 mixture distribution

专家混合体 mixture of experts

模态 modality

峰值 mode

模型 model

模型平均 model averaging

模型压缩 model compression

模型可辨识性 model identifiability

模型并行 model parallelism

矩 moment

矩匹配 moment matching

动量 momentum

蒙特卡罗 Monte Carlo

Moore-Penrose 伪逆 Moore-Penrose pseudoinverse

道德化 moralization

道德图 moralized graph

多层感知机 multilayer perceptron

多峰值 multimodal

多模态学习 multimodal learning

多项式分布 multinomial distribution

Multinoulli 分布 multinoulli distribution

多预测深度玻尔兹曼机 multi-prediction deep Boltzmann machine

多任务学习 multitask learning

多维正态分布 multivariate normal distribution

朴素贝叶斯 naive Bayes

奈特 nats

自然语言处理 Natural Language Processing

最近邻 nearest neighbor

最近邻图 nearest neighbor graph

最近邻回归 nearest neighbor regression

负定 negative definite

负部函数 negative part function

负相 negative phase

半负定 negative semidefinite

Nesterov 动量 Nesterov momentum

网络 network

神经自回归密度估计器 neural auto-regressive density estimator

神经自回归网络 neural auto-regressive network

神经语言模型 Neural Language Model

神经机器翻译 Neural Machine Translation

神经网络 neural network

神经网络图灵机 neural Turing machine

牛顿法 Newton’s method

n-gram n-gram

没有免费午餐定理 no free lunch theorem

噪声 noise

噪声分布 noise distribution

噪声对比估计 noise-contrastive estimation

非凸 nonconvex

非分布式 nondistributed

非分布式表示 nondistributed representation

非线性共轭梯度 nonlinear conjugate gradients

非线性独立成分估计 nonlinear independent components estimation

非参数 non-parametric

范数 norm

正态分布 normal distribution

正规方程 normal equation

归一化的 normalized

标准初始化 normalized initialization

数值 numeric value

数值优化 numerical optimization

对象识别 object recognition

目标 objective

目标函数 objective function

奥卡姆剃刀 Occam’s razor

one-hot one-hot

一次学习 one-shot learning

在线 online

在线学习 online learning

操作 operation

最佳容量 optimal capacity

原点 origin

正交 orthogonal

正交矩阵 orthogonal matrix

标准正交 orthonormal

输出 output

输出层 output layer

过完备 overcomplete

过估计 overestimation

过拟合 overfitting

过拟合机制 overfitting regime

上溢 overflow

并行分布式处理 Parallel Distributed Processing

并行回火 parallel tempering

参数 parameter

参数服务器 parameter server

参数共享 parameter sharing

有参情况 parametric case

参数化整流线性单元 parametric ReLU

偏导数 partial derivative

配分函数 Partition Function

性能度量 performance measures

性能度量 performance metrics

置换不变性 permutation invariant

持续性对比散度 persistent contrastive divergence

音素 phoneme

语音 phonetic

分段 piecewise

点估计 point estimator

策略 policy

策略梯度 policy gradient

池化 pooling

池化函数 pooling function

病态条件 poor conditioning

正定 positive definite

正部函数 positive part function

正相 positive phase

半正定 positive semidefinite

后验概率 posterior probability

幂方法 power method

PR 曲线 PR curve

精度 precision

精度矩阵 precision matrix

预测稀疏分解 predictive sparse decomposition

预训练 pretraining

初级视觉皮层 primary visual cortex

主成分分析 principal components analysis

先验概率 prior probability

先验概率分布 prior probability distribution

概率 PCA probabilistic PCA

概率密度函数 probability density function

概率分布 probability distribution

概率质量函数 probability mass function

专家之积 product of experts

乘法法则 product rule

成比例 proportional

提议分布 proposal distribution

伪似然 pseudolikelihood

象限对 quadrature pair

量子力学 quantum mechanics

径向基函数 radial basis function

随机搜索 random search

随机变量 random variable

值域 range

比率匹配 ratio matching

召回率 recall

接受域 receptive field

再循环 recirculation

推荐系统 recommender system

重构 reconstruction

重构误差 reconstruction error

整流线性 rectified linear

整流线性变换 rectified linear transformation

整流线性单元 rectified linear unit

整流网络 rectifier network

循环 recurrence

循环卷积网络 recurrent convolutional network

循环网络 recurrent network

循环神经网络 recurrent neural network

回归 regression

正则化 regularization

正则化 regularize

正则化项 regularizer

强化学习 reinforcement learning

关系 relation

关系型数据库 relational database

重参数化 reparametrization

重参数化技巧 reparametrization trick

表示 representation

表示学习 representation learning

表示容量 representational capacity

储层计算 reservoir computing

受限玻尔兹曼机 Restricted Boltzmann Machine

反向相关 reverse correlation

反向模式累加 reverse mode accumulation

岭回归 ridge regression

右特征向量 right eigenvector

右奇异向量 right singular vector

风险 risk

行 row

扫视 saccade

鞍点 saddle point

无鞍牛顿法 saddle-free Newton method

相同 same

样本均值 sample mean

样本方差 sample variance

饱和 saturate

标量 scalar

得分 score

得分匹配 score matching

二阶导数 second derivative

二阶导数测试 second derivative test

第二层 second layer

二阶方法 second-order method

自对比估计 self-contrastive estimation

自信息 self-information

语义哈希 semantic hashing

半受限玻尔兹曼机 semi-restricted Boltzmann Machine

半监督 semi-supervised

半监督学习 semi-supervised learning

可分离的 separable

分离的 separate

分离 separation

情景 setting

浅度回路 shallow circuit

香农熵 Shannon entropy

香农 shannons

塑造 shaping

短列表 shortlist

sigmoid sigmoid

sigmoid 信念网络 sigmoid Belief Network

简单细胞 simple cell

奇异的 singular

奇异值 singular value

奇异值分解 singular value decomposition

奇异向量 singular vector

跳跃连接 skip connection

慢特征分析 slow feature analysis

慢性原则 slowness principle

平滑 smoothing

平滑先验 smoothness prior

softmax softmax

softmax 函数 softmax function

softmax 单元 softmax unit

softplus softplus

softplus 函数 softplus function

生成子空间 span

稀疏 sparse

稀疏激活 sparse activation

稀疏编码 sparse coding

稀疏连接 sparse connectivity

稀疏初始化 sparse initialization

稀疏交互 sparse interactions

稀疏权重 sparse weights

谱半径 spectral radius

语音识别 Speech Recognition

sphering sphering

尖峰和平板 spike and slab

尖峰和平板 RBM spike and slab RBM

虚假模态 spurious modes

方阵 square

标准差 standard deviation

标准误差 standard error

标准正态分布 standard normal distribution

声明 statement

平稳的 stationary

平稳分布 Stationary Distribution

驻点 stationary point

统计效率 statistical efficiency

统计学习理论 statistical learning theory

统计量 statistics

最陡下降 steepest descent

随机 stochastic

随机课程 stochastic curriculum

随机梯度上升 Stochastic Gradient Ascent

随机梯度下降 stochastic gradient descent

随机矩阵 Stochastic Matrix

随机最大似然估计 stochastic maximum likelihood

流 stream

步幅 stride

结构学习 structure learning

结构化概率模型 structured probabilistic model

结构化变分推断 structured variational inference

亚原子 subatomic

子采样 subsample

求和法则 sum rule

和-积网络 sum-product network

监督 supervised

监督学习 supervised learning

监督学习算法 supervised learning algorithm

监督模型 supervised model

监督预训练 supervised pretraining

支持向量 support vector

代理损失函数 surrogate loss function

符号 symbol

符号表示 symbolic representation

对称 symmetric

切面距离 tangent distance

切平面 tangent plane

正切传播 tangent prop

目标 target

泰勒 Taylor

导师驱动过程 teacher forcing

温度 temperature

回火转移 tempered transition

回火 tempering

张量 tensor

测试误差 test error

测试集 test set

碰撞情况 the collider case

绑定的权重 tied weights

Tikhonov 正则 Tikhonov regularization

平铺卷积 tiled convolution

时延神经网络 time delay neural network

时间步进 time step

Toeplitz 矩阵 Toeplitz matrix

标记 token

容差 tolerance

地质 ICA topographic ICA

训练误差 training error

训练集 training set

转录 transcribe

转录系统 transcription system

迁移学习 transfer learning

转移 transition

转置 transpose

三角不等式 triangle inequality

三角形化 triangulate

三角形化图 triangulated graph

三元语法 trigram

无偏 unbiased

无偏样本方差 unbiased sample variance

欠完备 undercomplete

欠定的 underdetermined

欠估计 underestimation

欠拟合 underfitting

欠拟合机制 underfitting regime

下溢 underflow

潜在 underlying

潜在成因 underlying cause

无向 undirected

无向模型 undirected Model

展开图 unfolded graph

展开 unfolding

均匀分布 uniform distribution

一元语法 unigram

单峰值 unimodal

单元 unit

单位范数 unit norm

单位向量 unit vector

万能近似定理 universal approximation theorem

万能近似器 universal approximator

万能函数近似器 universal function approximator

未标注 unlabeled

未归一化概率函数 unnormalized probability function

非共享卷积 unshared convolution

无监督 unsupervised

无监督学习 unsupervised learning

无监督学习算法 unsupervised learning algorithm

无监督预训练 unsupervised pretraining

有效 valid

验证集 validation set

梯度消失与爆炸问题 vanishing and exploding gradient problem

梯度消失 vanishing gradient

Vapnik-Chervonenkis 维度 Vapnik-Chervonenkis dimension

变量消去 variable elimination

方差 variance

方差减小 variance reduction

变分自编码器 variational auto-encoder

变分导数 variational derivative

变分自由能 variational free energy

变分推断 variational inference

去噪 denoise

向量 vector

虚拟对抗样本 virtual adversarial example

虚拟对抗训练 virtual adversarial training

可见层 visible layer

V-结构 V-structure

醒眠 wake sleep

warp warp

支持向量机 support vector machine

无向图模型 undirected graphical model

权重 weight

权重衰减 weight decay

权重比例推断规则 weight scaling inference rule

权重空间对称性 weight space symmetry

条件概率分布 conditional probability distribution

白化 whitening

宽度 width

赢者通吃 winner-take-all

正切传播 tangent propagation

流形正切分类器 manifold tangent classifier

词嵌入 word embedding

词义消歧 word-sense disambiguation

零数据学习 zero-data learning

零次学习 zero-shot learning
