Implementing PCA/Whitening

Revision as of 05:28, 29 April 2011

Latest revision as of 13:22, 7 April 2013

Line 6:

We achieve this by computing the mean for each patch and subtracting it for each patch. In Matlab, we can do this by using

-

avg = mean(x, 1);

+

avg = mean(x, 1); % Compute the mean pixel intensity value separately for each patch.

x = x - repmat(avg, size(x, 1), 1);

-

Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>. If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can instead compute this in one fell swoop as

+

Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>. If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as

sigma = x * x' / size(x, 2);
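As a rough check of the two steps above, the sketch below runs the per-patch mean subtraction and the vectorized covariance on a small toy matrix, then compares the result with the explicit sum. The toy data (rows of `magic(6)`) and the names `sigmaLoop` and `maxDiff` are illustrative assumptions, not part of the original page.

```matlab
% Toy data (assumed for illustration): n = 4 pixels per patch, m = 6 patches,
% stored one patch per column.
x = magic(6);
x = x(1:4, :);

% Subtract each patch's own mean intensity (mean taken down each column).
avg = mean(x, 1);
x = x - repmat(avg, size(x, 1), 1);

% Vectorized covariance, as in the text.
m = size(x, 2);
sigma = x * x' / m;

% Explicit sum over examples, shown only for comparison.
sigmaLoop = zeros(size(x, 1));
for i = 1:m
    sigmaLoop = sigmaLoop + x(:, i) * x(:, i)';
end
sigmaLoop = sigmaLoop / m;

% Largest elementwise difference; this should be at rounding-error level.
maxDiff = max(abs(sigma(:) - sigmaLoop(:)));
```

Both versions should agree up to floating-point rounding; the vectorized form just hands the whole sum to a single matrix product.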

Line 26:

Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:

-

xRot = U(:,1:k)' * x; % k is number of eigenvectors to keep

+

xRot = U' * x; % rotated version of the data.

-

xTilde = U(:,1:k) * xRot; % which corresponds to the # dimensions after reduction

+

xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,

-

% set k = size(x, 1) to keep all the eigenvectors

+

% where k is the number of eigenvectors to keep

This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.

Incidentally, if <math>x</math> is a <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized

implementation, and the expressions

-

above work too for computing <math>x_{rot}</math> and <math>\tilde{x}</math> for your entire training set

+

above work too for computing <math>x_{\rm rot}</math> and <math>\tilde{x}</math> for your entire training set

all in one go. The resulting

-

<math>xrot</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.

+

<math>x_{\rm rot}</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.
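The following is a minimal sketch of the latest revision's expressions applied to a whole data matrix at once. It assumes `U` and `S` come from `svd(sigma)`, a step this excerpt does not show, and it uses toy data and an arbitrary `k = 2`. The names `sameRows`, `covRot` and `offDiag` are illustrative only.

```matlab
% Toy zero-mean data (assumed): 4-by-6, one example per column.
x = magic(6);
x = x(1:4, :);
x = x - repmat(mean(x, 1), size(x, 1), 1);
m = size(x, 2);
sigma = x * x' / m;

% Assumption: the eigenvectors are obtained with svd; for a symmetric
% positive semidefinite sigma, the columns of U are its eigenvectors.
[U, S, V] = svd(sigma);

k = 2;                       % number of eigenvectors to keep (illustrative)
xRot = U' * x;               % n-by-m: rotated version of the data
xTilde = U(:, 1:k)' * x;     % k-by-m: reduced dimension representation

% xTilde is simply the first k rows of xRot.
sameRows = max(max(abs(xTilde - xRot(1:k, :))));

% The covariance of the rotated data should be (numerically) diagonal.
covRot = xRot * xRot' / m;
offDiag = max(max(abs(covRot - diag(diag(covRot)))));
```

Each column of `xRot` and `xTilde` corresponds to one training example, matching the description above.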

To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use

Line 49:

xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
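To see how the ZCA expression relates to PCA whitening, here is a hedged sketch on toy data. The PCA-whitening line is inferred from the ZCA line above (the same scaling, without the final rotation by `U`). `epsilon = 1e-5`, the toy data, and the names `scale`, `zcaDiff`, `covWhite`, `expected` and `covDiff` are assumptions made for illustration.

```matlab
% Toy zero-mean data (assumed): 4-by-6, one example per column.
x = magic(6);
x = x(1:4, :);
x = x - repmat(mean(x, 1), size(x, 1), 1);
m = size(x, 2);
sigma = x * x' / m;
[U, S, V] = svd(sigma);      % assumption: U, S obtained via svd

epsilon = 1e-5;              % illustrative regularization constant

% Per-component scaling shared by both whitening variants.
scale = diag(1 ./ sqrt(diag(S) + epsilon));

xPCAwhite = scale * U' * x;          % rotate, then rescale each component
xZCAwhite = U * scale * U' * x;      % same, then rotate back with U

% ZCA whitening is PCA whitening followed by a rotation by U.
zcaDiff = max(max(abs(xZCAwhite - U * xPCAwhite)));

% With regularization, component i of the PCA-whitened data has variance
% lambda_i / (lambda_i + epsilon), which is close to 1 when lambda_i >> epsilon.
covWhite = xPCAwhite * xPCAwhite' / m;
expected = diag(diag(S) ./ (diag(S) + epsilon));
covDiff = max(max(abs(covWhite - expected)));
```

Because each toy patch has its mean removed, `sigma` here is rank-deficient, so one component has variance near zero instead of one. The `epsilon` term is what keeps the scaling finite in that direction.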

+


@@ Line 6: / Line 6: @@
 We achieve this by computing the mean for each patch and subtracting it for each patch. In Matlab, we can do this by using
-  avg = mean(x, 1);
+  avg = mean(x, 1);     % Compute the mean pixel intensity value separately for each patch.
   x = x - repmat(avg, size(x, 1), 1);
-Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can instead compute this in one fell swoop as
+Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as
   sigma = x * x' / size(x, 2);
@@ Line 26: / Line 26: @@
 Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:
-  xRot = U(:,1:k)' * x;     % k is number of eigenvectors to keep
+  xRot = U' * x;          % rotated version of the data.
-  xTilde = U(:,1:k) * xRot; % which corresponds to the # dimensions after reduction
+  xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,
-                           % set k = size(x, 1) to keep all the eigenvectors
+                         % where k is the number of eigenvectors to keep
 This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.
 Incidentally, if <math>x</math> is a <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized
 implementation, and the expressions
-above work too for computing <math>x_{rot}</math> and <math>\tilde{x}</math> for your entire training set
+above work too for computing <math>x_{\rm rot}</math> and <math>\tilde{x}</math> for your entire training set
 all in one go.  The resulting
-<math>xrot</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.
+<math>x_{\rm rot}</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.
 To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use
@@ Line 49: / Line 49: @@
   xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
+{{PCA}}
+{{Languages|实现主成分分析和白化|中文}}