Autoencoders and Sparsity
From Ufldl
features---then this compression task would be very difficult. But if there is
structure in the data, for example, if some of the input features are correlated,
then this algorithm will be able to discover some of those correlations. In fact,
this simple autoencoder often ends up learning a low-dimensional representation very similar
to PCA's.
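To make the idea concrete, here is a minimal NumPy sketch (not from the original article; all sizes, learning rate, and variable names are illustrative) of a one-hidden-layer autoencoder that compresses 10 correlated input features through 3 hidden units and reconstructs the input:

```python
import numpy as np

# Illustrative sizes: 10 correlated input features, s_2 = 3 hidden units.
rng = np.random.default_rng(0)
n_in, n_hidden, n_samples = 10, 3, 500

# Build correlated inputs: 3 latent factors mixed into 10 features plus noise.
latent = rng.normal(size=(n_samples, 3))
mixing = rng.normal(size=(3, n_in))
X = latent @ mixing + 0.1 * rng.normal(size=(n_samples, n_in))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Autoencoder parameters (small random initialization).
W1 = rng.normal(scale=0.1, size=(n_in, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_in))
b2 = np.zeros(n_in)

lr = 0.1
for _ in range(2000):
    # Forward pass: encode to 3 hidden activations a^{(2)}, then decode.
    a2 = sigmoid(X @ W1 + b1)
    xhat = a2 @ W2 + b2          # linear reconstruction of the input
    err = xhat - X
    # Backpropagate the squared reconstruction error.
    gW2 = a2.T @ err / n_samples
    gb2 = err.mean(axis=0)
    d2 = (err @ W2.T) * a2 * (1.0 - a2)
    gW1 = X.T @ d2 / n_samples
    gb1 = d2.mean(axis=0)
    W2 -= lr * gW2; b2 -= lr * gb2
    W1 -= lr * gW1; b1 -= lr * gb1

mse = np.mean((sigmoid(X @ W1 + b1) @ W2 + b2 - X) ** 2)
```

Because the 10 features are driven by only 3 latent factors, the 3-unit bottleneck can reconstruct the data well, much as projecting onto the top principal components would.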
Our argument above relied on the number of hidden units <math>\textstyle s_2</math> being small. But
its output value is close to 1, or as being "inactive" if its output value is
close to 0. We would like to constrain the neurons to be inactive most of the
time. This discussion assumes a sigmoid activation function. If you are
using a tanh activation function, then we think of a neuron as being inactive
when it outputs values close to -1.
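The sigmoid/tanh distinction can be checked numerically. The short sketch below (illustrative, not from the article; `rho_hat` is a hypothetical name for the average activation) shows the two activation ranges and an "inactive most of the time" hidden layer:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Sigmoid outputs lie in (0, 1): "inactive" means output near 0.
# tanh outputs lie in (-1, 1): "inactive" means output near -1.
z = np.array([-6.0, 0.0, 6.0])
print(sigmoid(z))   # values near 0, 0.5, and 1
print(np.tanh(z))   # values near -1, 0, and 1

# Average activation of 5 hypothetical sigmoid hidden units over 100 examples.
# Strongly negative pre-activations keep the units inactive most of the time.
a2 = sigmoid(np.random.default_rng(1).normal(size=(100, 5)) - 4.0)
rho_hat = a2.mean(axis=0)   # small values => units are mostly inactive
```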
Recall that <math>\textstyle a^{(2)}_j</math> denotes the activation of hidden unit <math>\textstyle j</math> in the