Linear Decoders
== Sparse Autoencoder Recap ==
In the sparse autoencoder, we had 3 layers of neurons: an input layer, a hidden layer and an output layer. In our previous description of autoencoders (and of neural networks), every neuron in the neural network used the same activation function.
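To make the recap concrete, here is a minimal NumPy sketch of that forward pass (the tutorial's exercises use MATLAB, so this Python code and the names <code>W1, b1, W2, b2</code> are purely illustrative, not taken from the tutorial's code). Every layer, including the output, applies the same sigmoid activation:

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def autoencoder_forward(x, W1, b1, W2, b2):
    """Forward pass of a 3-layer autoencoder in which every
    neuron (hidden and output) uses the sigmoid activation."""
    z2 = W1 @ x + b1    # pre-activation of the hidden layer
    a2 = sigmoid(z2)    # hidden activations, in (0, 1)
    z3 = W2 @ a2 + b2   # pre-activation of the output layer
    a3 = sigmoid(z3)    # output is squashed into (0, 1)
    return a2, a3
</pre>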
Because the output layer used the sigmoid activation function, its outputs <math>a^{(3)}</math> were constrained to lie in the range <math>[0,1]</math>; and since the autoencoder tries to reconstruct its input, the inputs also had to be scaled to lie in <math>[0,1]</math>.
While some datasets like MNIST fit well with this scaling of the output, this can sometimes be awkward to satisfy. For example, if one uses PCA whitening, the input is no longer constrained to <math>[0,1]</math> and it's not clear what the best way is to scale the data to ensure it fits into the constrained range.
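To see this concretely, here is a small sketch of PCA whitening on synthetic Gaussian data (the data and dimensions are made up purely for illustration); the whitened values routinely fall well outside <math>[0,1]</math>:

<pre>
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 1000))          # 64-dim features, 1000 examples (synthetic)
X = X - X.mean(axis=1, keepdims=True)    # zero-mean each feature

# PCA whitening: rotate onto the principal axes, rescale to unit variance
sigma = X @ X.T / X.shape[1]
U, S, _ = np.linalg.svd(sigma)
X_white = np.diag(1.0 / np.sqrt(S + 1e-5)) @ U.T @ X

print(X_white.min(), X_white.max())      # values fall well outside [0, 1]
</pre>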
== Linear Decoder ==
One easy fix for this problem is to set <math>a^{(3)} = z^{(3)}</math>. Formally, this is achieved by having the output layer use the identity activation function <math>f(z) = z</math>, so that <math>a^{(3)} = f(z^{(3)}) = z^{(3)}</math> and the outputs are no longer constrained to <math>[0,1]</math>. An autoencoder in this configuration, with a sigmoid (or tanh) hidden layer and a linear output layer, is called a linear decoder.
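A minimal sketch of the resulting forward pass (again illustrative NumPy with assumed parameter names, reusing the <code>sigmoid</code> helper defined above): the hidden layer keeps its sigmoid nonlinearity, while the output layer applies the identity, so the reconstruction can take any real value:

<pre>
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def linear_decoder_forward(x, W1, b1, W2, b2):
    z2 = W1 @ x + b1
    a2 = sigmoid(z2)   # hidden layer: still sigmoid
    z3 = W2 @ a2 + b2
    a3 = z3            # output layer: identity activation, a(3) = z(3)
    return a2, a3
</pre>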
With this change, backpropagation simplifies at the output layer: since the derivative of the identity activation is 1, the output layer's error term becomes

<math>\delta^{(3)} = -(y - a^{(3)}),</math>

and the hidden layer's error term is computed as before:

<math>\delta^{(2)} = \left( (W^{(2)})^T \delta^{(3)} \right) \bullet f'(z^{(2)}).</math>
Because the hidden layer is using a sigmoid (or tanh) activation <math>f</math>, in the equation above <math>f'(\cdot)</math> should still be the derivative of the sigmoid (or tanh) function.
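Continuing the sketch above (and assuming the squared-error cost <math>\tfrac{1}{2}\|y - a^{(3)}\|^2</math> with a sigmoid hidden layer, whose derivative is <math>f'(z^{(2)}) = a^{(2)} \bullet (1 - a^{(2)})</math>), the two error terms could be computed as:

<pre>
def linear_decoder_deltas(y, a2, a3, W2):
    """y, a2, a3 are NumPy arrays; W2 is the output layer's weight matrix."""
    delta3 = -(y - a3)                    # f'(z3) = 1 for the identity output
    fprime_z2 = a2 * (1.0 - a2)           # sigmoid derivative at the hidden layer
    delta2 = (W2.T @ delta3) * fprime_z2  # elementwise product with f'(z2)
    return delta2, delta3
</pre>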
{{Languages|线性解码器|中文}}