Exercise:Learning color features with Sparse Autoencoders

From Ufldl

== Learning color features with Sparse Autoencoders ==
In this exercise, you will implement a [[Linear Decoders | linear decoder]] (a sparse autoencoder whose output layer uses a linear activation function). You will then apply it to learn features on color images from the STL-10 dataset. These features will be used in a later [[Exercise:Convolution and Pooling | exercise on convolution and pooling]] for classifying STL-10 images.
In the file <tt>[http://ufldl.stanford.edu/wiki/resources/linear_decoder_exercise.zip linear_decoder_exercise.zip]</tt> we have provided some starter code. You should write your code at the places indicated "YOUR CODE HERE" in the files.
=== Dependencies ===
You will need:

* <tt>sparseAutoencoderCost.m</tt> (and related functions) from [[Exercise:Sparse Autoencoder]]

The following additional file is also required for this exercise:

* [http://ufldl.stanford.edu/wiki/resources/stl10_patches_100k.zip Sampled 8x8 patches from the STL-10 dataset (stl10_patches_100k.zip)]
''If you have not completed the exercise listed above, we strongly suggest you complete it first.''
=== Learning from color image patches ===
In all the exercises so far, you have been working only with grayscale images. In this exercise, you will get to work with RGB color images for the first time.
Conveniently, the fact that an image has three color channels (RGB), rather than a single gray channel, presents little difficulty for the sparse autoencoder. You can just combine the intensities from all the color channels for the pixels into one long vector, as if you were working with a grayscale image with 3x the number of pixels as the original image.  
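As a concrete illustration of this flattening, here is a minimal sketch in Python/NumPy (the exercise itself is in MATLAB, and the patch here is randomly generated, not from the dataset). One common convention is to stack the three channel planes end-to-end:

```python
import numpy as np

# A hypothetical 8x8 RGB patch: height x width x 3 color channels,
# with intensities in [0, 1].
patch = np.random.rand(8, 8, 3)

# Stack the red, green, and blue channel planes end-to-end into one
# long vector, as if it were a grayscale image with 3x the pixels
# (8 * 8 * 3 = 192 values). The exact ordering convention does not
# matter to the autoencoder, as long as it is applied consistently.
x = np.concatenate([patch[:, :, c].ravel() for c in range(3)])

print(x.shape)  # (192,)
```

The autoencoder then treats this 192-dimensional vector exactly like any other input vector; nothing in the cost function or gradients needs to know the input was a color image.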
=== Step 0: Initialization ===
In this step, we initialize some parameters used in the exercise (see starter code for details).
=== Step 1: Modify your sparse autoencoder to use a linear decoder ===
Copy <tt>sparseAutoencoderCost.m</tt> to the directory for this exercise and rename it to <tt>sparseAutoencoderLinearCost.m</tt>. Rename the function <tt>sparseAutoencoderCost</tt> in the file to <tt>sparseAutoencoderLinearCost</tt>, and modify it to use a [[Linear Decoders | linear decoder]]. In particular, you should change the cost and gradients returned to reflect the change from a sigmoid to a linear decoder. After making this change, check your gradients to ensure that they are correct.
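To see the one change a linear decoder makes to backpropagation, here is a minimal forward-pass sketch in Python/NumPy (the exercise itself is in MATLAB; the layer sizes and variable names below are illustrative, not the exercise's actual settings):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy sizes for illustration only.
rng = np.random.default_rng(0)
n_vis, n_hid, m = 192, 400, 10          # visible units, hidden units, examples
W1 = rng.normal(scale=0.01, size=(n_hid, n_vis))
b1 = np.zeros((n_hid, 1))
W2 = rng.normal(scale=0.01, size=(n_vis, n_hid))
b2 = np.zeros((n_vis, 1))
x = rng.random((n_vis, m))              # one column per training example

# Forward pass: the hidden layer keeps its sigmoid activation...
a2 = sigmoid(W1 @ x + b1)
# ...but the output layer is now linear: a3 = z3, with no sigmoid applied.
a3 = W2 @ a2 + b2

# Output-layer error term. With a sigmoid decoder this would be
# -(x - a3) * a3 * (1 - a3); for a linear decoder f'(z3) = 1, so:
delta3 = -(x - a3)

# The squared-error term of the cost is unchanged by the switch:
sq_err = 0.5 * np.mean(np.sum((a3 - x) ** 2, axis=0))
```

The rest of the backpropagation (the hidden-layer delta, the sparsity penalty, and the weight-decay term) is the same as in the original sparse autoencoder; only the output activation and its derivative change.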
=== Step 2: Learn features on small patches ===
You will now use your sparse autoencoder to learn features on a set of 100,000 small 8x8 patches sampled from the larger 96x96 STL-10 images. (The [http://www.stanford.edu/~acoates//stl10/ STL-10 dataset] comprises 5000 training and 8000 test examples, each a 96x96 labelled color image belonging to one of ten classes: airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck.)
The code provided in this step trains your sparse autoencoder for 400 iterations with the default parameters initialized in step 0. This should take around 45 minutes. Your sparse autoencoder should learn features which, when visualized, look like edges and "opponent colors," as in the figure below.
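The starter code drives this optimization with minFunc's L-BFGS. For readers following along outside MATLAB, a rough analogue using SciPy looks like the sketch below (the cost function here is a tiny stand-in, not the actual <tt>sparseAutoencoderLinearCost</tt>):

```python
import numpy as np
from scipy.optimize import minimize

# Stand-in for sparseAutoencoderLinearCost: any function that returns
# (cost, gradient) for a flat parameter vector will work here.
def toy_cost(theta):
    cost = 0.5 * np.sum((theta - 1.0) ** 2)
    grad = theta - 1.0
    return cost, grad

theta0 = np.zeros(5)

# L-BFGS capped at 400 iterations, mirroring the exercise's settings.
res = minimize(toy_cost, theta0, jac=True, method="L-BFGS-B",
               options={"maxiter": 400})
opt_theta = res.x   # in the exercise, this would be reshaped into W1, b1, ...
```

With <tt>jac=True</tt>, SciPy expects the objective to return the cost and gradient together, just as minFunc does, so a correctly implemented cost function plugs into either optimizer.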
[[File:CNN_Features_Good.png|480px]]
If your parameters are improperly tuned (the default parameters should work), or if your implementation of the autoencoder is buggy, you might instead get images that look like one of the following:
<table cellpadding=5px>
<tr><td>[[File:cnn_Features_Bad1.png|240px]]</td><td>[[File:cnn_Features_Bad2.png|240px]]</td></tr>
</table>
The learned features will be saved to <tt>STL10Features.mat</tt>, which will be used in the later [[Exercise:Convolution and Pooling | exercise on convolution and pooling]].

Latest revision as of 21:00, 21 June 2011
