Exercise: Implement deep networks for digit classification
From Ufldl
To help you check that your implementation is correct, you should also check your gradients on a small synthetic dataset. We have implemented <tt>checkStackedAECost.m</tt> to help you check your gradients. If this check passes, you will have implemented fine-tuning correctly.
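The idea behind such a gradient check is to compare your analytically computed gradient against a central-difference approximation of the cost. Below is a minimal sketch of that comparison in Python/NumPy (the exercise itself uses MATLAB, and <tt>checkStackedAECost.m</tt> is the provided checker); the toy cost function and tolerance here are illustrative, not part of the exercise:

```python
import numpy as np

def numerical_gradient(J, theta, eps=1e-4):
    """Central-difference approximation of the gradient of J at theta."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e = np.zeros_like(theta)
        e[i] = eps
        # Perturb one parameter at a time in both directions
        grad[i] = (J(theta + e) - J(theta - e)) / (2 * eps)
    return grad

# Toy cost with a known analytic gradient: J(theta) = theta_0^2 + 3*theta_0*theta_1
J = lambda th: th[0] ** 2 + 3 * th[0] * th[1]
theta = np.array([4.0, 10.0])
analytic = np.array([2 * theta[0] + 3 * theta[1], 3 * theta[0]])
numeric = numerical_gradient(J, theta)

# Relative difference; for a correct gradient this should be tiny (e.g. < 1e-7)
diff = np.linalg.norm(numeric - analytic) / np.linalg.norm(numeric + analytic)
print(diff)
```

In the actual exercise you would apply the same comparison to the fine-tuning cost over all stacked-autoencoder and softmax parameters, on a small synthetic dataset so the loop over parameters stays cheap.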
'''Note:''' When adding the weight decay term to the cost, you should regularize only the softmax weights (do not regularize the weights that compute the hidden layer activations).
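Concretely, the weight decay term touches only the softmax parameters, in both the cost and the gradient. A small sketch in Python/NumPy (the layer sizes, variable names, and placeholder cost below are illustrative assumptions, not taken from the exercise code):

```python
import numpy as np

rng = np.random.default_rng(0)
lam = 1e-4  # weight decay parameter (lambda)

# Hypothetical parameter shapes for a two-layer stacked autoencoder + softmax
W1 = rng.standard_normal((200, 784))            # hidden layer 1 weights (NOT regularized)
W2 = rng.standard_normal((200, 200))            # hidden layer 2 weights (NOT regularized)
softmax_theta = rng.standard_normal((10, 200))  # softmax weights (regularized)

data_cost = 1.234  # placeholder for the unregularized cost on a batch

# Weight decay is added for the softmax weights ONLY:
cost = data_cost + (lam / 2.0) * np.sum(softmax_theta ** 2)

# The corresponding gradient contributions:
grad_softmax_decay = lam * softmax_theta  # decay term appears here...
grad_W1_decay = np.zeros_like(W1)         # ...but adds nothing to W1
grad_W2_decay = np.zeros_like(W2)         # ...or to W2
```

Regularizing the hidden-layer weights as well would make your analytic gradient disagree with <tt>checkStackedAECost.m</tt>, so this is a common place for the gradient check to fail.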
'''Implementation Tip:''' It is always a good idea to implement the code modularly and check (the gradient of) each part of the code before writing the more complicated parts.