Implementing PCA/Whitening

In this section, we summarize the PCA, PCA whitening and ZCA whitening algorithms, and also describe how you can implement them using efficient linear algebra libraries.

First, we need to ensure that the data has (approximately) zero-mean. For natural images, we achieve this (approximately) by subtracting the mean value of each image patch.
Concretely, we compute the mean of each patch and subtract it from that patch. In Matlab, we can do this using
  avg = mean(x, 1);    % Compute the mean pixel intensity value separately for each patch.
  x = x - repmat(avg, size(x, 1), 1);
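
As an aside (not part of the original notes), the same per-patch mean subtraction can also be written without <tt>repmat</tt> by using Matlab's <tt>bsxfun</tt>:

  x = bsxfun(@minus, x, avg);   % equivalent to the repmat version above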
Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as
  sigma = x * x' / size(x, 2);
(Check the math yourself for correctness.)  
Here, we assume that <math>x</math> is a data structure that contains one training example per column (so, <math>x</math> is an <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix).
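
As a quick sanity check (a sketch not in the original notes; the sizes below are arbitrary illustrative values), you can verify on a small random matrix that the vectorized expression matches the explicit sum in the definition of <math>\textstyle \Sigma</math>:

  nTest = 5; mTest = 20;                 % small illustrative sizes
  xTest = randn(nTest, mTest);           % one example per column
  sigmaVec = xTest * xTest' / mTest;     % vectorized computation
  sigmaSum = zeros(nTest, nTest);
  for i = 1:mTest
      sigmaSum = sigmaSum + xTest(:,i) * xTest(:,i)';   % explicit sum from the formula
  end
  sigmaSum = sigmaSum / mTest;
  disp(norm(sigmaVec - sigmaSum));       % should be numerically zero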
Next, PCA computes the eigenvectors of <math>\Sigma</math>.  One could do this using the Matlab <tt>eig</tt> function.  However, because <math>\Sigma</math> is a symmetric positive semi-definite matrix, it is more numerically reliable to do this using the <tt>svd</tt> function. Concretely, if you implement  
  [U,S,V] = svd(sigma);
then the matrix <math>U</math> will contain the eigenvectors of <math>\Sigma</math> (one eigenvector per column, sorted from the top eigenvector to the bottom), and the diagonal entries of the matrix <math>S</math> will contain the corresponding eigenvalues (also sorted in decreasing order). The matrix <math>V</math> will be equal to the transpose of <math>U</math>, and can be safely ignored.
(Note: The <tt>svd</tt> function actually computes the singular vectors and singular values of a matrix, which for the special case of a symmetric positive semi-definite matrix---which is all that we're concerned with here---is equal to its eigenvectors and eigenvalues.  A full discussion of singular vectors vs. eigenvectors is beyond the scope of these notes.)
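
If you want to convince yourself of this numerically (a sketch not in the original notes, assuming <tt>sigma</tt>, <tt>U</tt> and <tt>S</tt> have been computed as above), you can check that the columns of <tt>U</tt> are orthonormal eigenvectors of <tt>sigma</tt>:

  disp(norm(sigma * U - U * S));          % each column u_i satisfies sigma * u_i = lambda_i * u_i
  disp(norm(U' * U - eye(size(U, 1))));   % the columns of U are orthonormal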
Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:
  xRot = U' * x;         % rotated version of the data.
  xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,
                        % where k is the number of eigenvectors to keep
This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.  
Incidentally, if <math>x</math> is an <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized
implementation, and the expressions
above work too for computing <math>x_{\rm rot}</math> and <math>\tilde{x}</math> for your entire training set
all in one go.  The resulting  
<math>x_{\rm rot}</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.  
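
Although not shown above, you can also map the reduced representation back into the original <math>\textstyle n</math>-dimensional space to obtain an approximate reconstruction of the data. A minimal sketch (assuming <tt>U</tt>, <tt>xTilde</tt> and <tt>k</tt> are defined as above):

  xHat = U(:, 1:k) * xTilde;   % approximate reconstruction of x from its top k components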
To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use  
  xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;
Since <math>S</math>'s diagonal contains the eigenvalues <math>\textstyle \lambda_i</math>, this turns out to be a compact way of computing <math>\textstyle x_{{\rm PCAwhite},i} = x_{{\rm rot},i} / \sqrt{\lambda_i}</math> simultaneously for all <math>\textstyle i</math>.
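
Here, <tt>epsilon</tt> is a small constant added to the eigenvalues before taking the square root. As a rough check (a sketch not in the original notes, assuming <tt>xPCAwhite</tt> has been computed as above with an <tt>epsilon</tt> that is small relative to the eigenvalues), the covariance of the whitened data should be close to the identity matrix:

  covWhite = xPCAwhite * xPCAwhite' / size(xPCAwhite, 2);   % covariance of the whitened data
  disp(norm(covWhite - eye(size(covWhite, 1))));            % should be small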
Finally, you can also compute the ZCA whitened data <math>\textstyle x_{\rm ZCAwhite}</math> as:
  xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
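
For reference, here is a minimal end-to-end sketch assembled from the snippets above (not part of the original notes; the values of <tt>epsilon</tt> and <tt>k</tt> are illustrative only and should be chosen for your own data):

  epsilon = 1e-5;                        % illustrative regularization constant
  k = 50;                                % illustrative number of components to keep
  avg = mean(x, 1);                      % per-patch mean
  x = x - repmat(avg, size(x, 1), 1);    % zero-mean the data
  sigma = x * x' / size(x, 2);           % covariance matrix
  [U, S, V] = svd(sigma);                % eigenvectors and eigenvalues of sigma
  xRot = U' * x;                                               % rotated data
  xTilde = U(:, 1:k)' * x;                                     % reduced dimension representation
  xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;       % PCA whitened data
  xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;   % ZCA whitened data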