# Exercise:Independent Component Analysis

Revision as of 17:44, 19 June 2011 by Cyfoo.

## Independent Component Analysis

In this exercise, you will implement Independent Component Analysis on color images from the STL-10 dataset.

In the file independent_component_analysis_exercise.zip we have provided some starter code. You should write your code at the places indicated "YOUR CODE HERE" in the files.

For this exercise, you will need to modify OrthonormalICACost.m and ICAExercise.m.

### Dependencies

You will need:

The following additional file is also required for this exercise:

If you have not completed the exercises listed above, we strongly suggest you complete them first.

### Step 0: Initialization

In this step, we initialize some parameters used for the exercise.

### Step 1: Sample patches

In this step, we load and use a portion of the 8x8 patches from the STL-10 dataset (which you first saw in the exercise on linear decoders).

### Step 2: ZCA whiten patches

In this step, we ZCA whiten the patches as required by orthonormal ICA.
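As an illustration of what this step does, here is a minimal NumPy sketch of ZCA whitening. It is not the exercise's MATLAB starter code. It assumes patches are stored one per column, that the per-feature mean is subtracted first, and that no regularization is used (`epsilon = 0`); the function name `zca_whiten` and the synthetic data are hypothetical.

```python
import numpy as np

def zca_whiten(patches, epsilon=0.0):
    """ZCA-whiten patches of shape (n_features, n_samples).

    Returns the whitened patches and the ZCA matrix.
    epsilon = 0 is an assumption here; it requires a full-rank covariance.
    """
    mean = patches.mean(axis=1, keepdims=True)
    centered = patches - mean
    # Covariance of the centered data (features x features).
    sigma = centered @ centered.T / centered.shape[1]
    # sigma is symmetric, so eigh gives real eigenvalues and orthonormal eigenvectors.
    eigvals, eigvecs = np.linalg.eigh(sigma)
    zca = eigvecs @ np.diag(1.0 / np.sqrt(eigvals + epsilon)) @ eigvecs.T
    return zca @ centered, zca

# Synthetic correlated data standing in for image patches.
rng = np.random.default_rng(0)
mixing = rng.normal(size=(4, 4))
data = mixing @ rng.normal(size=(4, 2000))

white, zca = zca_whiten(data)
cov = white @ white.T / white.shape[1]  # should be close to the identity
```

After whitening, the sample covariance of the data is approximately the identity matrix, which is a quick sanity check before running the optimization.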

### Step 3: Implement and check ICA cost functions

In this step, you should implement the ICA cost function: orthonormalICACost in orthonormalICACost.m, which computes the cost and gradient for the orthonormal ICA objective. Note that the orthonormality constraint is not enforced in the cost function. It will be enforced by a projection in the gradient descent step, which you will have to complete in step 4.

When you have implemented the cost function, you should check the gradients numerically.
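The numerical check can be sketched as follows in NumPy. This is only an illustration, not the starter code: it assumes the cost is a smoothed L1 penalty on $Wx$, namely $\sum \sqrt{(Wx)^2 + \epsilon}$, which may differ from the exact objective in orthonormalICACost.m. The names `smoothed_l1_cost` and `numerical_gradient` are hypothetical.

```python
import numpy as np

def smoothed_l1_cost(W, X, epsilon=1e-2):
    """Assumed cost: sum of sqrt((W X)^2 + epsilon), with its gradient w.r.t. W."""
    WX = W @ X
    root = np.sqrt(WX ** 2 + epsilon)
    cost = root.sum()
    # Chain rule: d cost / d (WX) = WX / root, then multiply by X^T.
    grad = (WX / root) @ X.T
    return cost, grad

def numerical_gradient(f, W, h=1e-5):
    """Central-difference estimate of the gradient of scalar f at W."""
    grad = np.zeros_like(W)
    for idx in np.ndindex(*W.shape):
        old = W[idx]
        W[idx] = old + h
        plus = f(W)
        W[idx] = old - h
        minus = f(W)
        W[idx] = old  # restore the entry before moving on
        grad[idx] = (plus - minus) / (2 * h)
    return grad

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 5))   # small sizes keep the check fast
X = rng.normal(size=(5, 20))

cost, grad = smoothed_l1_cost(W, X)
num_grad = numerical_gradient(lambda M: smoothed_l1_cost(M, X)[0], W)
rel_error = np.linalg.norm(num_grad - grad) / np.linalg.norm(num_grad + grad)
```

A small relative error indicates that the analytic gradient agrees with the finite-difference estimate. Running the check on a small problem is advisable, since the loop evaluates the cost twice per parameter.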

Hint: if you are having difficulty deriving the gradients, you may wish to consult the page on deriving gradients using the backpropagation idea.

### Step 4: Optimization

In step 4, you will optimize the orthonormal ICA objective using gradient descent with backtracking line search. The line search code has already been provided for you; for more details, you may wish to consult the appendix of this exercise. The orthonormality constraint should be enforced with a projection, which you should fill in.
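One common way to project a matrix onto the set of matrices with orthonormal rows is symmetric orthogonalization, $W \leftarrow (WW^T)^{-1/2} W$. The NumPy sketch below assumes this is the intended projection; the text above does not spell out the formula, and the function name `project_orthonormal` is hypothetical.

```python
import numpy as np

def project_orthonormal(W):
    """Return (W W^T)^(-1/2) W, whose rows are orthonormal.

    Assumes W has full row rank, so W W^T is positive definite.
    """
    # W W^T is symmetric, so its inverse square root comes from eigh.
    eigvals, eigvecs = np.linalg.eigh(W @ W.T)
    inv_sqrt = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return inv_sqrt @ W

rng = np.random.default_rng(0)
W = rng.normal(size=(4, 6))
W_proj = project_orthonormal(W)
gram = W_proj @ W_proj.T  # should be close to the 4x4 identity
```

Checking that `W_proj @ W_proj.T` is close to the identity is the same kind of test a verification step for the projection would perform.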

Once you have filled in the code for the projection, check that it is correct by using the verification code provided. Once you have verified that your projection is correct, comment out the verification code and run the optimization. 10 000 iterations of gradient descent should take around 2 hours, and produce a basis which looks like the following:

Observe that few of the bases have been completely learned even after 10 000 iterations. This highlights a weakness of orthonormal ICA: it is difficult to optimize the objective with gradient descent while enforcing the orthonormality constraint, and convergence can be very slow. Hence, in situations where an orthonormal basis is not required, other faster methods of learning bases (such as sparse coding) may be preferable.

### Appendix

#### Backtracking line search

The backtracking line search used in the exercise is based on the one in Convex Optimization by Boyd and Vandenberghe. Given a descent direction $\vec{u}$ (in this exercise we use $\vec{u} = -\nabla f(\vec{x})$), we want to find a step size $t$ that gives a sufficiently large decrease in $f$. The general idea is to use a linear approximation (the first-order Taylor approximation) to the function $f$ at the current point $\vec{x}$, and to search for a step size $t$ such that the function's value decreases by at least $\alpha$ times the decrease predicted by the linear approximation, where $\alpha \in (0, 0.5)$. For more details, you may wish to consult the book.
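The idea can be sketched in a few lines of NumPy. This is an illustrative version, not the code provided with the exercise: it starts from $t = 1$ and multiplies $t$ by a shrink factor $\beta \in (0, 1)$ until $f(\vec{x} + t\vec{u}) \le f(\vec{x}) + \alpha t \nabla f(\vec{x})^T \vec{u}$ holds. The values of `alpha` and `beta` and the test function are assumptions chosen for the example.

```python
import numpy as np

def backtracking_line_search(f, grad_f, x, alpha=0.3, beta=0.5, t=1.0):
    """Shrink t until the sufficient-decrease condition holds along u = -grad f(x)."""
    fx = f(x)
    g = grad_f(x)
    u = -g
    slope = g @ u  # directional derivative along u; negative for a descent direction
    while f(x + t * u) > fx + alpha * t * slope:
        t *= beta
    return t, u

# Hypothetical test problem: a poorly scaled quadratic f(x) = 0.5 * x^T A x.
A = np.diag([1.0, 10.0])
f = lambda x: 0.5 * x @ (A @ x)
grad_f = lambda x: A @ x

x = np.array([1.0, 1.0])
t, u = backtracking_line_search(f, grad_f, x)
x_new = x + t * u
```

In this example the full step $t = 1$ overshoots badly because of the poor scaling, so the search halves the step several times before the decrease condition is met.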