Exercise: Implement deep networks for digit classification

From Ufldl

We first forward propagate the training set through the first autoencoder (using <tt>feedForwardAutoencoder.m</tt> that you completed in [[Exercise:Self-Taught_Learning]]) to obtain hidden unit activations. These activations are then used to train the second sparse autoencoder. Since this is just an adapted application of a standard autoencoder, it should run similarly to the first. Complete this part of the code so as to learn a second layer of features using your <tt>sparseAutoencoderCost.m</tt> and minFunc.
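As an illustration, this part of the script might look like the following sketch. The variable names (<tt>sae1OptTheta</tt>, <tt>hiddenSizeL1</tt>, and so on) are placeholders for whatever your starter code uses, not prescribed identifiers:

```matlab
% Sketch only -- adapt the names to your starter code.
% Forward propagate the raw data through the trained first autoencoder
% to obtain the L1 hidden-unit activations.
sae1Features = feedForwardAutoencoder(sae1OptTheta, hiddenSizeL1, ...
                                      inputSize, trainData);

% Train the second sparse autoencoder on those activations, exactly as
% the first one was trained on the raw input.
options.Method  = 'lbfgs';
options.maxIter = 400;
sae2Theta = initializeParameters(hiddenSizeL2, hiddenSizeL1);
[sae2OptTheta, cost] = minFunc(@(p) sparseAutoencoderCost(p, ...
    hiddenSizeL1, hiddenSizeL2, lambda, sparsityParam, beta, ...
    sae1Features), sae2Theta, options);
```

The only changes from the first-layer training are the input (L1 activations instead of raw pixels) and the layer sizes passed to the cost function.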
=== Step 3: Train the softmax classifier on the L2 features ===
Next, continue to forward propagate the L1 features through the second autoencoder (using <tt>feedForwardAutoencoder.m</tt>) to obtain the L2 hidden unit activations. These activations are then used to train the softmax classifier. You should be able to use <tt>softmaxTrain.m</tt> that you completed in [[Exercise:Softmax Regression]] to complete this part of the assignment.
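A sketch of this step, again with illustrative variable names rather than the exact ones in your starter code:

```matlab
% Sketch only -- compute the L2 features and fit the softmax classifier.
sae2Features = feedForwardAutoencoder(sae2OptTheta, hiddenSizeL2, ...
                                      hiddenSizeL1, sae1Features);

softmaxOptions.maxIter = 100;
softmaxModel = softmaxTrain(hiddenSizeL2, numClasses, lambda, ...
                            sae2Features, trainLabels, softmaxOptions);
```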
=== Step 4: Implement fine-tuning ===
To implement fine-tuning, we need to consider all three layers as a single model. Implement <tt>stackedAECost.m</tt> to return the cost and gradient of the model. The cost function should be defined as the log likelihood plus a weight decay term. The gradient should be computed using [[Backpropagation Algorithm | back-propagation as discussed earlier]].
To help you check that your implementation is correct, you should also check your gradients on a small synthetic dataset. We have implemented <tt>checkStackedAECost.m</tt> to help you check your gradients. If this check passes, you will have implemented fine-tuning correctly.
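The kind of check performed here can be sketched as follows: compare the analytic gradient from <tt>stackedAECost</tt> against a numerical estimate on a small problem (the parameter names below are illustrative):

```matlab
% Sketch only: analytic vs. numerical gradient on a small problem.
[cost, grad] = stackedAECost(theta, inputSize, hiddenSizeL2, ...
    numClasses, netconfig, lambda, data, labels);
numGrad = computeNumericalGradient(@(p) stackedAECost(p, inputSize, ...
    hiddenSizeL2, numClasses, netconfig, lambda, data, labels), theta);

% The relative difference should be very small (on the order of 1e-9).
disp(norm(numGrad - grad) / norm(numGrad + grad));
```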
{{Quote|
Note: When adding the weight decay term to the cost, you should regularize '''all''' the weights in the network.
}}
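Inside <tt>stackedAECost.m</tt>, regularizing all the weights could look like the sketch below. It assumes a cell-array stack representation in which <tt>stack{l}.w</tt> is the weight matrix of hidden layer <tt>l</tt> and <tt>softmaxTheta</tt> holds the softmax weights; adapt the names to your own code:

```matlab
% Sketch only: weight decay summed over every layer's weights
% (biases are conventionally not regularized).
weightDecay = sum(softmaxTheta(:) .^ 2);
for l = 1:numel(stack)
    weightDecay = weightDecay + sum(stack{l}.w(:) .^ 2);
end
cost = cost + (lambda / 2) * weightDecay;

% The matching gradient contributions are lambda * softmaxTheta for the
% softmax layer and lambda * stack{l}.w for each hidden layer.
```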
=== Step 5: Test the model ===
Finally, you will need to classify with this model; complete the code in <tt>stackedAEPredict.m</tt> to classify using the stacked autoencoder with a classification layer.
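The prediction step can be sketched as follows, again assuming the illustrative stack representation used above (<tt>stack{l}.w</tt>, <tt>stack{l}.b</tt>, <tt>softmaxTheta</tt>):

```matlab
% Sketch of stackedAEPredict: forward propagate through each autoencoder
% layer, then pick the most probable class under the softmax layer.
sigmoid = @(z) 1 ./ (1 + exp(-z));

a = data;
for l = 1:numel(stack)
    a = sigmoid(bsxfun(@plus, stack{l}.w * a, stack{l}.b));
end
[~, pred] = max(softmaxTheta * a, [], 1);
```

Since softmax is monotonic in its input, taking the <tt>max</tt> over the pre-softmax activations gives the same predicted class as computing the full probabilities.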
After completing these steps, running the entire script in <tt>stackedAETrain.m</tt> will perform layer-wise training of the stacked autoencoder, fine-tune the model, and measure its performance on the test set. If you've done all the steps correctly, you should get an accuracy of about XX.X percent.

Revision as of 05:12, 10 May 2011
