Exercise:Convolution and Pooling
From Ufldl
You will also need:
* <tt>sparseAutoencoderLinearCost.m</tt> (and related functions) from [[Exercise:Learning_color_features_with_Sparse_Autoencoders]]
* <tt>softmaxTrain.m</tt> (and related functions) from [[Exercise:Softmax Regression]]

''If you have not completed the exercises listed above, we strongly suggest you complete them first.''
=== Step 1: Learn color features ===
Learn a set of color features by working through [[Exercise:Learning_color_features_with_Sparse_Autoencoders]]; we will use these features in the next steps. You should learn 400 features, which should look like this:
[[File:cnn_Features_Good.png|480px]]
- | |||
- | |||
- | |||
- | |||
- | |||
- | |||
- | |||
=== Step 2: Convolution and pooling ===
Now that you have learned features for small patches, you will convolve these learned features with the larger images, and pool the convolved features for use in a classifier later.
==== Step 2a: Convolution ====
Implement convolution, as described in [[feature extraction using convolution]], in the function <tt>cnnConvolve</tt> in <tt>cnnConvolve.m</tt>. Implementing convolution is somewhat involved, so we will guide you through the process below.
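The exercise itself is written in MATLAB/Octave, but the idea behind <tt>cnnConvolve</tt> can be sketched in a few lines of NumPy. This is a minimal, illustrative version for single-channel images (the exercise uses three color channels); all function and variable names here are our own, not the exercise's API. Each learned feature is slid over every valid position of the image, and the autoencoder's bias and sigmoid are applied at each position:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cnn_convolve_sketch(patch_dim, images, W, b):
    """Convolve each learned feature with each (single-channel) image.

    images: (num_images, image_dim, image_dim)
    W:      (num_features, patch_dim * patch_dim)  -- learned weights
    b:      (num_features,)                        -- learned biases
    Returns (num_images, num_features, out_dim, out_dim) activations,
    where out_dim = image_dim - patch_dim + 1 ("valid" convolution).
    """
    num_images, image_dim, _ = images.shape
    num_features = W.shape[0]
    out_dim = image_dim - patch_dim + 1
    convolved = np.zeros((num_images, num_features, out_dim, out_dim))
    for i in range(num_images):
        for f in range(num_features):
            feature = W[f].reshape(patch_dim, patch_dim)
            for r in range(out_dim):
                for c in range(out_dim):
                    patch = images[i, r:r + patch_dim, c:c + patch_dim]
                    # Cross-correlate the feature with the patch, then
                    # apply the autoencoder's bias and sigmoid.
                    convolved[i, f, r, c] = sigmoid(np.sum(feature * patch) + b[f])
    return convolved
```

Note one design point: MATLAB's <tt>conv2</tt> flips the kernel, so in the exercise you flip each feature before calling it; the sketch above computes the cross-correlation directly, which avoids the flip.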
Taking the preprocessing steps into account, the feature activations that you should compute are <math>\sigma(W(T(x-\bar{x})) + b)</math>, where <math>T</math> is the whitening matrix and <math>\bar{x}</math> is the mean patch. Expanding this, you obtain <math>\sigma(WTx - WT\bar{x} + b)</math>, which suggests that you should convolve the images with <math>WT</math> rather than <math>W</math> as earlier, and that you should add <math>(b - WT\bar{x})</math>, rather than just <math>b</math>, to <tt>convolvedFeatures</tt> before finally applying the sigmoid function.
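The key point is that the whitening and mean subtraction can be folded into the learned parameters once, before any convolution is done. A hedged NumPy sketch of that precomputation (names like <tt>mean_patch</tt> are illustrative stand-ins for the exercise's variables):

```python
import numpy as np

def fold_in_whitening(W, b, T, mean_patch):
    """Fold ZCA whitening and mean subtraction into the parameters.

    W:          (num_features, patch_len) learned weights
    b:          (num_features,)           learned biases
    T:          (patch_len, patch_len)    ZCA whitening matrix
    mean_patch: (patch_len,)              mean patch subtracted in preprocessing
    """
    WT = W @ T                    # convolve images with WT instead of W
    b_eff = b - WT @ mean_patch   # add (b - WT * mean_patch) instead of b
    return WT, b_eff
```

After this step, <math>\sigma(WTx + (b - WT\bar{x}))</math> computed on the raw patch <math>x</math> equals <math>\sigma(W(T(x-\bar{x})) + b)</math> computed on the preprocessed patch, so the convolution code itself does not need to know about whitening at all.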
==== Step 2b: Checking ====
We have provided some code for you to check that you have done the convolution correctly. The code randomly checks the convolved values for a number of (feature, row, column) tuples by computing the feature activations for the selected features and patches directly using the sparse autoencoder.
==== Step 2c: Pooling ====
Implement [[pooling]] in the function <tt>cnnPool</tt> in <tt>cnnPool.m</tt>.
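As with convolution, the exercise code is MATLAB/Octave; the following is a minimal NumPy sketch of mean pooling over disjoint regions, using our own illustrative names. Each pooled value is simply the average of one <tt>pool_dim</tt> x <tt>pool_dim</tt> block of the convolved feature map:

```python
import numpy as np

def cnn_pool_sketch(pool_dim, convolved):
    """Mean-pool over disjoint pool_dim x pool_dim regions.

    convolved: (num_images, num_features, conv_dim, conv_dim)
    Returns (num_images, num_features, conv_dim // pool_dim, conv_dim // pool_dim).
    """
    num_images, num_features, conv_dim, _ = convolved.shape
    out_dim = conv_dim // pool_dim
    pooled = np.zeros((num_images, num_features, out_dim, out_dim))
    for r in range(out_dim):
        for c in range(out_dim):
            region = convolved[:, :,
                               r * pool_dim:(r + 1) * pool_dim,
                               c * pool_dim:(c + 1) * pool_dim]
            # Average over the two spatial axes of the region.
            pooled[:, :, r, c] = region.mean(axis=(2, 3))
    return pooled
```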
=== Step 3: Use pooled features for classification ===
Once you have implemented pooling, you will use the pooled features to train a softmax classifier that maps them to the class labels. The code in this section uses <tt>softmaxTrain</tt> from the softmax exercise to train a softmax classifier on the pooled features for 500 iterations, which should take around 5 minutes.
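Before training, the 4-dimensional pooled-feature array has to be flattened into a design matrix with one column per image, the layout that softmax-style training code typically expects. A hedged NumPy sketch of that reshaping (the array layout shown is illustrative; the exercise's MATLAB arrays are ordered differently):

```python
import numpy as np

def to_softmax_input(pooled):
    """Flatten pooled features into a (features x examples) design matrix.

    pooled: (num_images, num_features, out_dim, out_dim)
    Returns (num_features * out_dim * out_dim, num_images), one column
    per image, ready to hand to a softmax trainer.
    """
    num_images = pooled.shape[0]
    return pooled.reshape(num_images, -1).T
```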
=== Step 4: Test classifier ===
Now that you have a trained softmax classifier, you can see how well it performs on the test set. This section contains code that will load the test set (a smaller part of the STL-10 dataset, specifically 3200 rescaled 64x64 images from 4 classes) and obtain the pooled, convolved features for the images using the functions <tt>cnnConvolve</tt> and <tt>cnnPool</tt> that you wrote earlier, together with the preprocessing matrices <tt>ZCAWhite</tt> and <tt>meanImage</tt> computed earlier when preprocessing the training images. These pooled features are then run through the softmax classifier, and the accuracy of the predictions is computed. You should expect an accuracy of around 77-78%.