Visualizing a Trained Autoencoder

From Ufldl

Line 14:
<!-- This is the activation function <math>\textstyle g(\cdot)</math> applied to an affine function of the input.!-->
:[Original]:
-
We will visualize the function computed by hidden unit  ---which depends on the parameters   (ignoring the bias term for now)---using a 2D image. In particular, we think of   as some non-linear feature of the input . We ask: What input image   would cause   to be maximally activated? (Less formally, what is the feature that hidden unit   is looking for?) For this question to have a non-trivial answer, we must impose some constraints on . If we suppose that the input is norm constrained by , then one can show (try doing this yourself) that the input which maximally activates hidden unit   is given by setting pixel   (for all 100 pixels, ) to
+
:We will visualize the function computed by hidden unit <math>\textstyle i</math>---which depends on the parameters <math>\textstyle W^{(1)}_{ij}</math> (ignoring the bias term for now)---using a 2D image. In particular, we think of <math>\textstyle a^{(2)}_i</math> as some non-linear feature of the input <math>\textstyle x</math>. We ask: What input image <math>\textstyle x</math> would cause <math>\textstyle a^{(2)}_i</math> to be maximally activated? (Less formally, what is the feature that hidden unit <math>\textstyle i</math> is looking for?) For this question to have a non-trivial answer, we must impose some constraints on <math>\textstyle x</math>. If we suppose that the input is norm constrained by <math>\textstyle ||x||^2 = \sum_{i=1}^{100} x_i^2 \leq 1</math>, then one can show (try doing this yourself) that the input which maximally activates hidden unit <math>\textstyle i</math> is given by setting pixel <math>\textstyle x_j</math> (for all 100 pixels, <math>\textstyle j=1,\ldots, 100</math>) to
 +
:<math>\begin{align}
 +
x_j = \frac{W^{(1)}_{ij}}{\sqrt{\sum_{j=1}^{100} (W^{(1)}_{ij})^2}}.
 +
\end{align}</math>
[Draft translation]:
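We will use a 2D image to visualize the function computed by hidden unit <math>\textstyle i</math>, which depends on the parameters <math>\textstyle W^{(1)}_{ij}</math> (ignoring the bias term <math>\textstyle b_i</math>). If we regard <math>\textstyle a^{(2)}_i</math> as a non-linear feature of the input vector <math>\textstyle x</math>, we need to ask: what input image <math>\textstyle x</math> would make the activation <math>\textstyle a^{(2)}_i</math> reach its maximum? (In other words, what kind of feature is hidden unit <math>\textstyle i</math> looking for?) For this question to have a meaningful answer, we must constrain <math>\textstyle x</math>. We impose a norm constraint on the squared length of the input vector, <math>\textstyle ||x||^2 = \sum_{i=1}^{100} x_i^2 \leq 1</math>. One can then show (readers are encouraged to derive this themselves) that the input producing the maximal activation of the hidden unit sets each input pixel <math>\textstyle x_j</math> (for all 100 input pixels, <math>\textstyle j=1,\ldots,100</math>) to the value given by the expression for <math>\textstyle x_j</math> above.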
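
The closed-form expression for <math>\textstyle x_j</math> above can be checked numerically. The following is a minimal sketch, not part of the original tutorial. It assumes a sigmoid activation, a 10x10 input patch stored in row-major order, and randomly generated weights standing in for a trained <math>\textstyle W^{(1)}</math>. The function names (`max_activation_input`, `to_image`, `activation`) are hypothetical and chosen only for illustration. Because the activation function is increasing, maximizing <math>\textstyle a^{(2)}_i</math> under the norm constraint amounts to maximizing <math>\textstyle \sum_j W^{(1)}_{ij} x_j</math>, which by the Cauchy-Schwarz inequality is achieved by the normalized weight row.

```python
import math
import random


def max_activation_input(w_row):
    """Return x with x_j = W_ij / sqrt(sum_j W_ij^2) for one hidden unit's weight row."""
    norm = math.sqrt(sum(w * w for w in w_row))
    if norm == 0.0:
        raise ValueError("all-zero weight row: every input gives the same activation")
    return [w / norm for w in w_row]


def to_image(pixels, side=10):
    """Reshape a flat list of side*side pixels into rows (row-major order is an assumption)."""
    if len(pixels) != side * side:
        raise ValueError("expected %d pixels, got %d" % (side * side, len(pixels)))
    return [pixels[r * side:(r + 1) * side] for r in range(side)]


def activation(w_row, x, bias=0.0):
    """Sigmoid of the affine function of the input (sigmoid is assumed here)."""
    z = sum(w * xi for w, xi in zip(w_row, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical weights for hidden unit i; a real run would use a trained W^(1) row.
rng = random.Random(0)
w_i = [rng.gauss(0.0, 1.0) for _ in range(100)]

x_star = max_activation_input(w_i)  # norm-bounded input that maximally activates unit i
image = to_image(x_star)            # 10x10 grid that can be rendered as a grayscale patch
```

Rendering `image` for every hidden unit, one patch per unit, gives the kind of 2D visualization the text describes. The bias term only shifts the pre-activation by a constant, so it does not change which input is the maximizer.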

Revision as of 11:05, 7 March 2013
