Autoencoders and Sparsity

From Ufldl

[[Image:Autoencoder636.png|400px|center]]
【Original text】

The autoencoder tries to learn a function <math>\textstyle h_{W,b}(x) \approx x</math>.  In other
words, it is trying to learn an approximation to the identity function, so as
to output <math>\textstyle \hat{x}</math> that is similar to <math>\textstyle x</math>.  The identity function seems a
particularly trivial function to be trying to learn; but by placing constraints
on the network, such as by limiting the number of hidden units, we can discover
interesting structure about the data.  As a concrete example, suppose the
inputs <math>\textstyle x</math> are the pixel intensity values from a <math>\textstyle 10 \times 10</math> image (100
pixels) so <math>\textstyle n=100</math>, and there are <math>\textstyle s_2=50</math> hidden units in layer <math>\textstyle L_2</math>.  Note that
we also have <math>\textstyle y \in \Re^{100}</math>.  Since there are only 50 hidden units, the
network is forced to learn a ''compressed'' representation of the input.
I.e., given only the vector of hidden unit activations <math>\textstyle a^{(2)} \in \Re^{50}</math>,
it must try to '''reconstruct''' the 100-pixel input <math>\textstyle x</math>.  If the input were completely
random (say, each <math>\textstyle x_i</math> comes from an IID Gaussian independent of the other
features), then this compression task would be very difficult.  But if there is
structure in the data, for example, if some of the input features are correlated,
then this algorithm will be able to discover some of those correlations. In fact,
this simple autoencoder often ends up learning a low-dimensional representation very similar
to the one found by PCA.

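As a rough illustration of the 100-50-100 shape described above, the following NumPy sketch computes the hidden activations <math>\textstyle a^{(2)}</math> and the reconstruction <math>\textstyle \hat{x}</math> for one input. The sigmoid activation, the small random initialization, the uniform stand-in for pixel intensities, and the names <code>sigmoid</code> and <code>autoencoder_forward</code> are assumptions made for this example; the passage itself does not fix them, and no training is performed here.

```python
import numpy as np

def sigmoid(z):
    # Logistic activation; assumed here, the text above does not specify one.
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_forward(x, W1, b1, W2, b2):
    """Return (a2, x_hat) for an autoencoder with a single hidden layer."""
    a2 = sigmoid(W1 @ x + b1)      # hidden activations a^(2), length s2
    x_hat = sigmoid(W2 @ a2 + b2)  # reconstruction h_{W,b}(x), length n
    return a2, x_hat

rng = np.random.default_rng(0)
n, s2 = 100, 50  # 10x10 image, 50 hidden units

# Untrained parameters with small random weights (illustrative only).
W1 = rng.normal(scale=0.01, size=(s2, n))
b1 = np.zeros(s2)
W2 = rng.normal(scale=0.01, size=(n, s2))
b2 = np.zeros(n)

x = rng.uniform(size=n)  # stand-in for pixel intensities in [0, 1]
a2, x_hat = autoencoder_forward(x, W1, b1, W2, b2)

# Squared reconstruction error that training would try to reduce.
reconstruction_error = 0.5 * np.sum((x_hat - x) ** 2)
print(a2.shape, x_hat.shape, reconstruction_error)
```

Training would adjust <code>W1</code>, <code>b1</code>, <code>W2</code>, <code>b2</code> so that this error becomes small across the training set, which forces the 50 hidden activations to carry enough information to rebuild the 100 inputs.
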
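【Initial translation】

The autoencoder neural network tries to learn a function <math>\textstyle h_{W,b}(x) \approx x</math>. In other words, it tries to approximate the identity function, so that the output <math>\textstyle \hat{x}</math> is close to the input <math>\textstyle x</math>. The identity function may look very easy to learn, but when we impose certain constraints on the autoencoder network, such as limiting the number of hidden units, we can discover interesting structure in the input data. As an example, suppose the input <math>\textstyle x</math> to an autoencoder network is the pixel intensity values of a <math>\textstyle 10 \times 10</math> image, so <math>\textstyle n=100</math>, and its hidden layer <math>\textstyle L_2</math> has <math>\textstyle s_2=50</math> hidden units. Note that the output is 100-dimensional, <math>\textstyle y \in \Re^{100}</math>. Since there are only 50 hidden units, we force the autoencoder network to learn a compressed representation of the input data, because it has to reconstruct the 100-dimensional pixel input <math>\textstyle x</math> from the 50-dimensional vector of hidden unit activations <math>\textstyle a^{(2)} \in \Re^{50}</math>. If the input data were completely random, for example if each input <math>\textstyle x_i</math> were an IID Gaussian random variable independent of the other features, then this compressed representation would be very hard to learn. But if the input data contains some particular structure, for example if some of the input features are correlated, then this algorithm can discover those correlations in the input data. In fact, this simple autoencoder network can often learn a low-dimensional representation of the input data that is very similar to the result of principal component analysis (PCA).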
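
The contrast between random and structured inputs can be checked numerically. The sketch below does not train an autoencoder; it uses PCA (via SVD) as a stand-in, since the passage says the learned representation is often very similar to PCA's. The sample count, the 10-dimensional latent structure, the noise level, and the helper name <code>pca_relative_error</code> are all assumptions chosen for illustration.

```python
import numpy as np

def pca_relative_error(X, k):
    """Fraction of total variance lost when keeping the top k principal components."""
    Xc = X - X.mean(axis=0)                        # center each feature
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    X_hat = (Xc @ Vt[:k].T) @ Vt[:k]               # project to k dims, then reconstruct
    return np.sum((Xc - X_hat) ** 2) / np.sum(Xc ** 2)

rng = np.random.default_rng(0)
m, n, k = 2000, 100, 50  # samples, input dimension, compressed dimension

# Case 1: every feature is an independent standard Gaussian (no structure).
X_iid = rng.normal(size=(m, n))

# Case 2: features are correlated because they are mixtures of only
# 10 latent factors, plus a little independent noise.
Z = rng.normal(size=(m, 10))
A = rng.normal(size=(10, n))
X_corr = Z @ A + 0.1 * rng.normal(size=(m, n))

err_iid = pca_relative_error(X_iid, k)
err_corr = pca_relative_error(X_corr, k)
print("relative error, IID inputs:       ", err_iid)
print("relative error, correlated inputs:", err_corr)
```

With these settings the 50-dimensional code loses a large share of the variance for the IID inputs and almost none for the correlated inputs, which matches the claim that compression only works well when the data has structure.
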

Revision as of 11:45, 12 March 2013
