Autoencoders and Sparsity

Revision as of 01:25, 22 April 2011

m (Cleaned up quotes)

Revision as of 23:22, 22 April 2011

@@ Line 24: / Line 24: @@
 features---then this compression task would be very difficult.  But if there is
 structure in the data, for example, if some of the input features are correlated,
-then this algorithm will be able to discover some of those correlations.\footnote{In fact,
+then this algorithm will be able to discover some of those correlations. In fact,
 this simple autoencoder often ends up learning a low-dimensional representation very similar
-to PCA's.}
+to PCA's.
 Our argument above relied on the number of hidden units <math>\textstyle s_2</math> being small.  But
@@ Line 39: / Line 39: @@
 its output value is close to 1, or as being "inactive" if its output value is
 close to 0.  We would like to constrain the neurons to be inactive most of the
-time.\footnote{This discussion assumes a sigmoid activation function.  If you are
+time. This discussion assumes a sigmoid activation function.  If you are
 using a tanh activation function, then we think of a neuron as being inactive
-when it outputs values close to -1.}
+when it outputs values close to -1.
 Recall that <math>\textstyle a^{(2)}_j</math> denotes the activation of hidden unit <math>\textstyle j</math> in the