Exercise:PCA and Whitening

From Ufldl

(PCA, PCA whitening and ZCA implementation)
Line 1: Line 1:
== PCA, PCA whitening and ZCA implementation ==
 +
 +
In this problem set, you will implement PCA, PCA whitening and ZCA, as described in the lecture notes.
 +
 +
In the file <tt>...</tt>, we have provided some starter code in MATLAB. You should write your code at the places indicated in the files ("YOUR CODE HERE"). You will only have to modify <tt>pca_gen.m</tt>.
=== Step 0: Load data ===
-
[[Image:raw_images.png|thumb|left|240px|alt=Raw images|Raw images]]
+
The starter code contains code to load some natural images and sample 10000 patches of 14x14 pixels from them. The raw patches sampled from the images will look something like this:
 +
 
 +
[[File:raw_images.png|240px|alt=Raw images|Raw images]]
 +
 
 +
These patches are stored as column vectors <math>x^{(i)} \in \mathbb{R}^{196}</math> in the <math>196 \times 10000</math> matrix <math>x</math>.
=== Step 1: Implement PCA ===
Line 9: Line 17:
==== Step 1a: Implement PCA ====
-
Implement PCA to obtain xRot, the matrix in which the data is expressed with respect to the eigenbasis of sigma, which is the matrix U.
+
In this step, you will implement PCA to obtain <math>x_{rot}</math>, the matrix in which the data is "rotated" to the basis comprising the principal components (i.e. the eigenbasis of <math>\Sigma</math>).
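
The rotation can be sketched as follows. This is a minimal illustration rather than the starter code: the synthetic matrix <tt>x</tt> stands in for the patch data, the per-dimension mean subtraction is an assumption made here so that <tt>sigma</tt> is a true covariance matrix, and the names <tt>S</tt>, <tt>Q</tt>, <tt>n</tt> and <tt>m</tt> are hypothetical.

```matlab
% Synthetic stand-in for the patch matrix (assumption: not the exercise data).
n = 20; m = 500;
[Q, ~] = qr(randn(n));                                 % random orthogonal basis
x = Q * diag(sqrt(linspace(10, 1, n))) * randn(n, m);  % correlated columns
x = x - repmat(mean(x, 2), 1, m);                      % zero mean in every dimension

sigma = x * x' / m;        % n x n covariance matrix of the data
[U, S, ~] = svd(sigma);    % columns of U: eigenvectors; diag(S): eigenvalues, descending
xRot = U' * x;             % data expressed in the eigenbasis of sigma
```

Using <tt>svd</tt> on the symmetric positive semi-definite matrix <tt>sigma</tt> returns the eigenvectors already sorted by decreasing eigenvalue, which is convenient for the later steps.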
==== Step 1b: Check covariance ====
-
The covariance matrix for the data expressed with respect to the basis U should be a diagonal matrix with non-zero entries only along the main diagonal. We will verify this here. Write code to compute the covariance matrix, covar. When visualised as an image, you should see a straight line across the diagonal (non-zero entries) against a blue background (zero entries).
+
To verify that your implementation of PCA is correct, you should check the covariance matrix for the rotated data. PCA guarantees that the covariance matrix for the rotated data is a diagonal matrix (a matrix with non-zero entries only along the main diagonal). Implement code to compute the covariance matrix and verify this property. One way to do this is to compute the covariance matrix, and visualise it using the MATLAB command <tt>imagesc</tt>. The image should show a multicoloured diagonal line against a blue background.
-
Visualise the covariance matrix. You should see a line across the diagonal against a blue background.
+
[[File:pca_covar.png|360px]]
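
A numerical version of this check might look like the sketch below. It assumes synthetic zero-mean data in place of the patches, and the names <tt>offDiag</tt> and <tt>relErr</tt> are hypothetical; the plotting call is left as a comment so the snippet also runs without a display.

```matlab
% Synthetic zero-mean stand-in for the patch matrix (assumption).
n = 20; m = 500;
[Q, ~] = qr(randn(n));
x = Q * diag(sqrt(linspace(10, 1, n))) * randn(n, m);
x = x - repmat(mean(x, 2), 1, m);
[U, S, ~] = svd(x * x' / m);
xRot = U' * x;

covar = xRot * xRot' / m;                 % covariance of the rotated data
offDiag = covar - diag(diag(covar));      % keep only the off-diagonal entries
relErr = max(abs(offDiag(:))) / max(diag(covar));
fprintf('largest off-diagonal entry relative to largest variance: %g\n', relErr);
% imagesc(covar); colorbar;               % visual check, as described above
```

If the rotation is correct, <tt>relErr</tt> should be at the level of floating-point rounding, and the diagonal of <tt>covar</tt> should match the eigenvalues in <tt>diag(S)</tt>.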
-
 
+
-
[[Image:pca_covar.png|thumb|left|240px]]
+
=== Step 2: Find number of components to retain ===
-
Write code to determine k, the number of components to retain in order to retain at least 99% of the variance.
+
In this step, you will find <math>k</math>, the number of principal components needed to retain at least 99% of the variance. In the following step, you will discard all but <math>k</math> principal components, reducing the dimension of the original data to <math>k</math>.
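
One way to find <math>k</math> is to take the cumulative sum of the eigenvalues and pick the first index at which the retained fraction reaches the target. The eigenvalues below are made-up numbers chosen for illustration, and the names <tt>lambda</tt> and <tt>retained</tt> are hypothetical.

```matlab
% Hypothetical eigenvalues of the covariance matrix, sorted in descending order.
lambda = [64; 32; 16; 8; 4; 2; 1; 1];

retained = cumsum(lambda) / sum(lambda);  % fraction of variance kept by the first j components
k = find(retained >= 0.99, 1);            % smallest j that keeps at least 99%; k is 7 here
```

With real data, <tt>lambda</tt> would be the diagonal of the eigenvalue matrix obtained in Step 1a.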
=== Step 3: PCA with dimension reduction ===
-
Now that you have found k, you can reduce the dimension of the data by discarding the remaining dimensions. In this way, you can represent the data in k dimensions instead of the original 196, which will save you computational time when running learning algorithms on the reduced representation.
+
Now that you have found <math>k</math>, you can reduce the dimension of the data by discarding the remaining dimensions. In this way, you can represent the data in <math>k</math> dimensions instead of the original 196, which will save you computational time when running learning algorithms on the reduced representation.
-
Following the dimension reduction, invert the PCA transformation to produce the matrix xHat, the dimension-reduced data with respect to the original basis. Visualise the data and compare it to the raw data. You will observe that there is little loss due to throwing away the principal components that correspond to dimensions with low variation.
+
To see the effect of dimension reduction, invert the PCA transformation to produce the matrix <math>\hat{x}</math>, the dimension-reduced data with respect to the original basis. Visualise <math>\hat{x}</math> and compare it to the raw data, <math>x</math>. You will observe that little is lost by discarding the principal components that correspond to directions of low variance. For comparison, you may also wish to generate and visualise <math>\hat{x}</math> when only 50% of the variance is retained.
-
Visualise the data, and compare it to the raw data. You should observe that the raw and processed data are of comparable quality. For comparison, you may wish to generate a PCA reduced image which retains only 50% of the variance.
+
<table>
-
 
+
<tr>
-
[[Image:pca_images.png|thumb|left|240px|alt=PCA dimension-reduced images (99% variance)|PCA dimension-reduced images (99% variance)]] [[Image:raw_images.png|thumb|left|240px|alt=Raw images|Raw images]] [[Image:pca_images_50.png|thumb|left|240px|alt=PCA dimension-reduced images (50% variance)|PCA dimension-reduced images (50% variance)]]
+
<td>[[File:pca_images.png|240px|alt=PCA dimension-reduced images (99% variance)|PCA dimension-reduced images (99% variance)]]</td>
 +
<td>[[File:raw_images.png|240px|alt=Raw images|Raw images]]</td>
 +
<td>[[File:pca_images_50.png|240px|alt=PCA dimension-reduced images (50% variance)|PCA dimension-reduced images (50% variance)]]</td>
 +
</tr>
 +
<tr>
 +
<td>PCA dimension-reduced images<br /> (99% variance)</td>
 +
<td>Raw images <br /> &nbsp; </td>
 +
<td>PCA dimension-reduced images<br /> (50% variance)</td>
 +
</tr>
 +
</table>
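
The reduction and its inversion can be sketched as below, again on synthetic zero-mean data rather than the patches. The value of <tt>k</tt> is picked arbitrarily for illustration, and <tt>xTilde</tt>, <tt>lost</tt> and <tt>expected</tt> are hypothetical names.

```matlab
% Synthetic zero-mean stand-in for the patch matrix (assumption).
n = 20; m = 500; k = 5;                   % k chosen arbitrarily for this sketch
[Q, ~] = qr(randn(n));
x = Q * diag(sqrt(linspace(10, 1, n))) * randn(n, m);
x = x - repmat(mean(x, 2), 1, m);
[U, S, ~] = svd(x * x' / m);

xTilde = U(:, 1:k)' * x;                  % k x m reduced representation
xHat = U(:, 1:k) * xTilde;                % n x m approximation in the original basis

% The fraction of variance lost equals the share of the discarded eigenvalues.
lambda = diag(S);
lost = sum(sum((x - xHat) .^ 2)) / sum(sum(x .^ 2));
expected = sum(lambda(k + 1:end)) / sum(lambda);
```

Comparing <tt>lost</tt> with <tt>expected</tt> is a quick sanity check that the reconstruction discards exactly the low-variance directions.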
=== Step 4: PCA with whitening and regularisation ===
Line 37: Line 52:
==== Step 4a: Implement PCA with whitening and regularisation ====
-
Implement PCA with whitening and regularisation to produce the matrix xPCAWhite.  
+
Now implement PCA with whitening and regularisation to produce the matrix <math>x_{PCAWhite}</math>.
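
A minimal sketch of regularised PCA whitening follows. The data are synthetic, and the value of <tt>epsilon</tt> is an assumption made for illustration, not a value taken from the exercise.

```matlab
% Synthetic zero-mean stand-in for the patch matrix (assumption).
n = 20; m = 500;
epsilon = 1e-5;                           % assumed regularisation constant
[Q, ~] = qr(randn(n));
x = Q * diag(sqrt(linspace(10, 1, n))) * randn(n, m);
x = x - repmat(mean(x, 2), 1, m);
[U, S, ~] = svd(x * x' / m);

% Rotate into the eigenbasis, then rescale each component by 1/sqrt(lambda_i + epsilon).
xPCAWhite = diag(1 ./ sqrt(diag(S) + epsilon)) * U' * x;
```

Adding <tt>epsilon</tt> before taking the square root keeps the scaling factors bounded when some eigenvalues are close to zero.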
==== Step 4b: Check covariance ====
-
PCA with whitening results a covariance matrix that is (approximately) equal to the identity matrix. We will verify this here. Write code to compute the covariance matrix, covar. When visualised as an image, you should see a red line across the diagonal (one entries) against a blue background (zero entries).
+
As with PCA alone, PCA with whitening results in processed data that has a diagonal covariance matrix. However, unlike PCA alone, whitening additionally ensures that the diagonal entries are (approximately) equal to 1, i.e. that the covariance matrix is (approximately) the identity matrix. To verify that your implementation of PCA with whitening is correct, implement code to compute the covariance matrix and check this property. As earlier, you can visualise the covariance matrix with <tt>imagesc</tt>. You should see a red line across the diagonal (corresponding to the one entries) against a blue background (corresponding to the zero entries).
-
Visualise the covariance matrix. You should see a red line across the diagonal against a blue background.
+
[[File:pca_whitened_covar.png|360px]]
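
The check can also be done numerically, as in this sketch on synthetic data with an assumed <tt>epsilon</tt>. With regularisation, each diagonal entry equals <math>\lambda_i / (\lambda_i + \epsilon)</math>, which is why the identity is only approximate; the names <tt>lambda</tt> and <tt>offDiag</tt> are hypothetical.

```matlab
% Synthetic zero-mean stand-in for the patch matrix (assumption).
n = 20; m = 500;
epsilon = 1e-5;                           % assumed regularisation constant
[Q, ~] = qr(randn(n));
x = Q * diag(sqrt(linspace(10, 1, n))) * randn(n, m);
x = x - repmat(mean(x, 2), 1, m);
[U, S, ~] = svd(x * x' / m);
lambda = diag(S);
xPCAWhite = diag(1 ./ sqrt(lambda + epsilon)) * U' * x;

covar = xPCAWhite * xPCAWhite' / m;       % covariance of the whitened data
offDiag = covar - diag(diag(covar));      % should be zero up to rounding
fprintf('diagonal range: [%g, %g]\n', min(diag(covar)), max(diag(covar)));
% imagesc(covar); colorbar;               % visual check, as described above
```

The printed diagonal range should sit just below 1 when <tt>epsilon</tt> is small relative to the eigenvalues.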
-
 
+
-
[[Image:pca_whitened_covar.png|thumb|left|240px]]
+
=== Step 5: ZCA whitening ===
-
Now implement ZCA whitening to produce the matrix xZCAWhite. Visualise the data and compare it to the raw data. You should observe that whitening results in, among other things, enhanced edges.
+
Now implement ZCA whitening to produce the matrix <math>x_{ZCAWhite}</math>. Visualise <math>x_{ZCAWhite}</math> and compare it to the raw data, <math>x</math>. You should observe that whitening results in, among other things, enhanced edges.
-
 
+
-
Visualise the data, and compare it to the raw data. You should observe that the whitened images have enhanced edges.
+
-
[[Image:zca_whitened_images.png|thumb|left|240px|alt=ZCA whitened images|ZCA whitened images]] [[Image:raw_images.png|thumb|left|240px|alt=Raw images|Raw images]]
+
<table>
 +
<tr>
 +
<td>
 +
[[File:zca_whitened_images.png|240px|alt=ZCA whitened images|ZCA whitened images]]  
 +
</td><td>
 +
[[File:raw_images.png|240px|alt=Raw images|Raw images]]
 +
</td>
 +
</tr>
 +
<tr>
 +
<td>ZCA whitened images</td>
 +
<td>Raw images</td>
 +
</tr>
 +
</table>
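
ZCA whitening can be sketched as PCA whitening followed by a rotation back into the original basis. As in the earlier sketches, the data are synthetic and <tt>epsilon</tt> is an assumed value; <tt>W</tt> is a hypothetical name for the whitening matrix.

```matlab
% Synthetic zero-mean stand-in for the patch matrix (assumption).
n = 20; m = 500;
epsilon = 1e-5;                           % assumed regularisation constant
[Q, ~] = qr(randn(n));
x = Q * diag(sqrt(linspace(10, 1, n))) * randn(n, m);
x = x - repmat(mean(x, 2), 1, m);
[U, S, ~] = svd(x * x' / m);

% ZCA whitening matrix: rotate, rescale, then rotate back with U.
W = U * diag(1 ./ sqrt(diag(S) + epsilon)) * U';
xZCAWhite = W * x;
```

Because <tt>W</tt> is symmetric and maps back to the original pixel basis, the whitened patches remain recognisable as images, which is what makes the visual comparison with the raw data meaningful.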

Revision as of 06:44, 4 April 2011
