Exercise:Sparse Autoencoder
==Download Related Reading==
* [http://nlp.stanford.edu/~socherr/sparseAutoencoder_2011new.pdf sparseae_reading.pdf]
* [http://www.stanford.edu/class/cs294a/cs294a_2011-assignment.pdf sparseae_exercise.pdf]
==Sparse autoencoder implementation==
Randomly pick one of the images, then sample an 8×8 image patch from the selected image, and convert the image patch (either in row-major order or column-major order; it doesn't matter) into a 64-dimensional vector to get a training example <math>x \in \Re^{64}</math>.

Complete the code in <tt>sampleIMAGES.m</tt>. Your code should sample 10000 image patches.
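As a rough illustration, the sampling loop might look like the sketch below. This is not a prescribed implementation: the <tt>IMAGES</tt> array, its 512×512×10 shape, and the <tt>load IMAGES</tt> call are assumptions based on the starter code.

<pre>
% Sketch of the sampling loop for sampleIMAGES.m. The array IMAGES and
% its 512x512x10 shape are assumptions based on the starter code.
load IMAGES;                                 % assumed: 512x512x10 array IMAGES

patchsize  = 8;
numpatches = 10000;
patches    = zeros(patchsize * patchsize, numpatches);
[rows, cols, numimages] = size(IMAGES);

for i = 1:numpatches
    img = randi(numimages);                  % pick a random image
    r   = randi(rows - patchsize + 1);       % random top-left corner
    c   = randi(cols - patchsize + 1);
    patch = IMAGES(r:r+patchsize-1, c:c+patchsize-1, img);
    patches(:, i) = patch(:);                % column-major flatten into R^64
end
</pre>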
We will use the L-BFGS algorithm. This is provided to you in a function called <tt>minFunc</tt> (code by Mark Schmidt), included in the starter code. (For the purpose of this assignment, you only need to call minFunc with the default parameters; you do not need to know how L-BFGS works.) We have already provided code in <tt>train.m</tt> that calls <tt>minFunc</tt> to train the model.
The default parameter values should work, but feel free to play with different settings of the parameters as well.
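For concreteness, a call to <tt>minFunc</tt> might look like the sketch below. The options follow Mark Schmidt's minFunc interface; the cost-function name <tt>sparseAutoencoderCost</tt>, the hyperparameter variables, and the helper <tt>initializeParameters</tt> are assumed from the starter code rather than shown in this page.

<pre>
% Sketch of a typical minFunc call (train.m already sets this up; the
% cost-function name and hyperparameter values here are illustrative).
addpath minFunc/

options.Method  = 'lbfgs';    % use minFunc's L-BFGS implementation
options.maxIter = 400;        % maximum number of iterations
options.display = 'on';

theta = initializeParameters(hiddenSize, visibleSize);   % assumed starter-code helper

[opttheta, cost] = minFunc(@(p) sparseAutoencoderCost(p, visibleSize, ...
                               hiddenSize, lambda, sparsityParam, ...
                               beta, patches), ...
                           theta, options);
</pre>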
'''Implementation tip:''' Once your backpropagation implementation is correctly computing the derivatives (as verified using gradient checking in Step 3), make sure you are no longer running the gradient check on every step when you use it with L-BFGS to optimize <math>J_{\rm sparse}(W,b)</math>. Backpropagation computes the derivatives of <math>J_{\rm sparse}(W,b)</math> fairly efficiently; additionally computing the gradient numerically on every step would slow down your program significantly.
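One simple way to follow this tip is to guard the numerical check behind a flag, so it runs once on a reduced problem and is skipped during the full optimization. In the sketch below, the <tt>DEBUG</tt> flag and the 10-patch reduced problem are illustrative choices; <tt>computeNumericalGradient</tt> refers to the checking routine from Step 3.

<pre>
% Run the numerical gradient check once (on a reduced problem), then
% disable it for the full L-BFGS run. The DEBUG flag is just one way
% to structure this.
DEBUG = false;                % set to true only while verifying gradients

if DEBUG
    costFunc = @(p) sparseAutoencoderCost(p, visibleSize, hiddenSize, ...
                        lambda, sparsityParam, beta, patches(:, 1:10));
    [cost, grad] = costFunc(theta);
    numgrad = computeNumericalGradient(costFunc, theta);
    fprintf('Relative difference: %g\n', ...
            norm(numgrad - grad) / norm(numgrad + grad));
end
</pre>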
===Step 5: Visualization===
Our implementation took around 5 minutes to run on a fast computer. In case you end up needing to try out multiple implementations or different parameter values, be sure to budget enough time for debugging.
[[Category:Exercises]]
{{Sparse_Autoencoder}}