Exercise:Self-Taught Learning

Revision as of 18:28, 7 April 2011 (view source)

Maiyifan (Talk | contribs)

(→Step Two: Train the sparse autoencoder: Added instructions on saving weights to disk)

Revision as of 18:37, 7 April 2011 (view source)

Maiyifan (Talk | contribs)

Line 11:

===Step One: Generate the input and test data sets===

-

Download and decompress <tt>self_taught_assign.zip</tt>, which contains starter code for this exercise. Additionally, you will need to download the datasets from the MNIST Handwritten Digit Database for this project. Download and decompress the files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt> to the <tt>mnist/</tt> path of the project folder. The functions to read data from the raw files have already been provided. You will need to write functions to separate the cases into separate sets containing the digits - to 4, and 5 to 9. Fill in your code in the function files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt>. The results of your function will be visualized, corresponding to the unlabelled training set, the labelled training set, and the test set. Ensure that the first row only contains digits 0 to 4, and the second and third only contains digits 5 to 9.

+

Download and decompress <tt>self_taught_assign.zip</tt>, which contains starter code for this exercise. Additionally, you will need to download the datasets from the MNIST Handwritten Digit Database for this project. Download and decompress the files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt> to the <tt>mnist/</tt> path of the project folder. The functions to read data from the raw files have already been provided. You will need to write functions to separate the cases into separate sets containing the digits 0 to 4, and 5 to 9. Fill in your code in the function files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt>. The results of your function will be visualized as three rows, corresponding to the unlabelled training set, the labelled training set, and the test set. Ensure that the first row contains only digits 0 to 4, and that the second and third rows contain only digits 5 to 9.

-

~~IMAGE HERE~~

+

[[File:selfTaughtInput.png]]

===Step Two: Train the sparse autoencoder===

Line 21:

Hint: This step takes a very long time, so you might want to avoid running it on subsequent trials! To do so, after running this step, run <tt>save('theta.mat', 'theta');</tt> from the command line. Then modify one line near the top of the <tt>trainSelfTaught.m</tt> script to set <tt>loadTheta = true;</tt>. This skips the autoencoder training step and loads the saved weights from disk.

-

~~IMAGE HERE~~

+

[[File:selfTaughtFeatures.png]]

===Step Three: Training the logistic regression model===
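
The digit split described in Step One can be illustrated with logical indexing. This is only a minimal sketch, not the starter code's implementation: the variable names <tt>images</tt> and <tt>labels</tt>, the one-column-per-example layout, and the tiny random stand-in data are all assumptions, since the contents of <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt> are not shown here.

```matlab
% Hypothetical stand-in data: 4 "pixels" per image, 10 images,
% one image for each digit 0..9 (real MNIST images would have 784 rows).
images = rand(4, 10);
labels = (0:9)';

% Logical masks selecting the two digit groups.
lowMask  = labels >= 0 & labels <= 4;   % digits 0 to 4
highMask = labels >= 5 & labels <= 9;   % digits 5 to 9

% Keep the columns (examples) and labels that match each mask.
lowImages  = images(:, lowMask);
lowLabels  = labels(lowMask);
highImages = images(:, highMask);
highLabels = labels(highMask);

% Sanity checks: each set holds only its own digits, and nothing is lost.
assert(all(lowLabels <= 4));
assert(all(highLabels >= 5));
assert(size(lowImages, 2) + size(highImages, 2) == size(images, 2));
```

Checking the label ranges right after the split is a cheap way to catch a swapped mask before the visualization step.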

Exercise:Self-Taught Learning

From Ufldl

@@ Line 11: / Line 11: @@
 ===Step One: Generate the input and test data sets===
-Download and decompress <tt>self_taught_assign.zip</tt>, which contains starter code for this exercise. Additionally, you will need to download the datasets from the MNIST Handwritten Digit Database for this project. Download and decompress the files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt> to the <tt>mnist/</tt> path of the project folder. The functions to read data from the raw files have already been provided. You will need to write functions to separate the cases into separate sets containing the digits - to 4, and 5 to 9. Fill in your code in the function files  <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt>. The results of your function will be visualized, corresponding to the unlabelled training set, the labelled training set, and the test set. Ensure that the first row only contains digits 0 to 4, and the second and third only contains digits 5 to 9.
+Download and decompress <tt>self_taught_assign.zip</tt>, which contains starter code for this exercise. Additionally, you will need to download the datasets from the MNIST Handwritten Digit Database for this project. Download and decompress the files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt> to the <tt>mnist/</tt> path of the project folder. The functions to read data from the raw files have already been provided. You will need to write functions to separate the cases into separate sets containing the digits 0 to 4, and 5 to 9. Fill in your code in the function files <tt>getLabelledData.m</tt> and <tt>getUnlabelledData.m</tt>. The results of your function will be visualized as three rows, corresponding to the unlabelled training set, the labelled training set, and the test set. Ensure that the first row contains only digits 0 to 4, and that the second and third rows contain only digits 5 to 9.
-IMAGE HERE
+[[File:selfTaughtInput.png]]
 ===Step Two: Train the sparse autoencoder===
@@ Line 21: / Line 21: @@
 Hint: This step takes a very long time, so you might want to avoid running it on subsequent trials! To do so, after running this step, run <tt>save('theta.mat', 'theta');</tt> from the command line. Then modify one line near the top of the <tt>trainSelfTaught.m</tt> script to set <tt>loadTheta = true;</tt>. This skips the autoencoder training step and loads the saved weights from disk.
-IMAGE HERE
+[[File:selfTaughtFeatures.png]]
 ===Step Three: Training the logistic regression model===
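
The save-and-reload hint in Step Two amounts to caching the trained weights behind a flag. The sketch below shows one way such a flag might be used; it is an assumption about the pattern, not the actual contents of <tt>trainSelfTaught.m</tt>. The file name <tt>theta_demo.mat</tt> in the temporary directory and the placeholder weight vector are hypothetical, chosen so that running the sketch cannot overwrite a real <tt>theta.mat</tt>.

```matlab
loadTheta = false;                               % set to true on later runs to skip training
thetaFile = fullfile(tempdir, 'theta_demo.mat'); % hypothetical demo file, not the real theta.mat

if loadTheta && exist(thetaFile, 'file') == 2
    % Reuse the weights saved by an earlier run.
    stored = load(thetaFile, 'theta');
    theta = stored.theta;
else
    % Placeholder for the slow autoencoder training step (hypothetical values).
    theta = zeros(10, 1);
    % Save the weights so that the next run can load them instead.
    save(thetaFile, 'theta');
end
```

Guarding the load with <tt>exist</tt> means that a missing file falls back to training instead of raising an error.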