Implementing PCA/Whitening

First, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute it in one fell swoop as
  sigma = x * x' / size(x, 2);
(Check the math yourself for correctness.)
Here, we assume that <math>x</math> is a data structure that contains one training example per column (so <math>x</math> is an <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix).
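With this convention, the <math>\textstyle i</math>-th training example <math>\textstyle x^{(i)}</math> is simply the <math>\textstyle i</math>-th column of <math>x</math>; for example (the variable name <tt>xi</tt> is just illustrative):

  xi = x(:, i);   % the i-th training example, an n-by-1 column vector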
Next, PCA computes the eigenvectors of <math>\Sigma</math>.  One could do this using the Matlab <tt>eig</tt> function.  However, because <math>\Sigma</math> is a symmetric positive semi-definite matrix, it is more numerically reliable to do this using the <tt>svd</tt> function. Concretely, if you implement
  [U,S,V] = svd(sigma);
then the matrix <math>U</math> will contain the eigenvectors of <math>\Sigma</math> (one eigenvector per column, sorted from the top eigenvector to the bottom), and the diagonal entries of the matrix <math>S</math> will contain the corresponding eigenvalues (also sorted in decreasing order).  The matrix <math>V</math> will be equal to the transpose of <math>U</math>, and can be safely ignored.
(Note: The <tt>svd</tt> function actually computes the singular vectors and singular values of a matrix, which for the special case of a symmetric positive semi-definite matrix---which is all that we're concerned with here---is equal to its eigenvectors and eigenvalues.  A full discussion of singular vectors vs. eigenvectors is beyond the scope of these notes.)
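If you want to convince yourself of this, one quick optional check is that the factors returned by <tt>svd</tt> reconstruct <tt>sigma</tt> and that the columns of <tt>U</tt> are orthonormal; for example (the variable names here are just illustrative):

  % Both quantities should be close to zero (up to numerical error).
  reconError = norm(sigma - U * S * U', 'fro');
  orthoError = norm(U' * U - eye(size(U, 1)), 'fro');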
Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:
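For example (a minimal sketch, assuming <tt>k</tt> is the number of components you want to retain; the variable names <tt>xRot</tt> and <tt>xTilde</tt> are illustrative):

  xRot = U' * x;            % rotated version of the data
  xTilde = U(:, 1:k)' * x;  % reduced-dimension representation: the top k components of xRot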
Incidentally, if <math>x</math> is an <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized implementation, and the expressions above work too for computing <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> for your entire training set all in one go.  The resulting <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> will have one column corresponding to each training example.
To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use
  xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;
Here <tt>epsilon</tt> is a small positive constant added to the eigenvalues for regularization.  Since <math>S</math>'s diagonal contains the eigenvalues <math>\textstyle \lambda_i</math>, this turns out to be a compact way of computing <math>\textstyle x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i}}{\sqrt{\lambda_i}}</math> simultaneously for all <math>\textstyle i</math>.
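Written per component, this divides row <math>\textstyle i</math> of <math>\textstyle x_{\rm rot}</math> by <math>\textstyle \sqrt{\lambda_i + \epsilon}</math>, and for small <tt>epsilon</tt> the whitened data should have covariance close to the identity matrix.  A sketch of both points (the variable names <tt>xPCAwhiteAlt</tt> and <tt>covWhite</tt> are illustrative):

  % Equivalent per-component form of the whitening step.
  xRot = U' * x;
  xPCAwhiteAlt = bsxfun(@rdivide, xRot, sqrt(diag(S) + epsilon));
  % Covariance of the whitened data; approximately the identity for small epsilon.
  covWhite = xPCAwhite * xPCAwhite' / size(xPCAwhite, 2);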
