Implementing PCA/Whitening

Revision as of 16:07, 20 March 2013 (view source)

Kandeng (Talk | contribs)

← Older edit

Latest revision as of 13:22, 7 April 2013 (view source)

Kandeng (Talk | contribs)

Line 1:

-

~~版本作者 Email 说明~~

+

In this section, we summarize the PCA, PCA whitening and ZCA whitening algorithms,

+

and also describe how you can implement them using efficient linear algebra libraries.

-

0.~~1 @visualzhou visualzhou@gmail~~.~~com 初译，尽量保留原英文格式~~

+

First, we need to ensure that the data has (approximately) zero-mean. For natural images, we achieve this (approximately) by subtracting the mean value of each image patch.

-

0.~~2 @Emma_lzhang Emma.lzhang@gmail.com 一审校对~~

+

We achieve this by computing the mean for each patch and subtracting it for each patch. In Matlab, we can do this by using

-

+

avg = mean(x, 1); % Compute the mean pixel intensity value separately for each patch.

-

+

-

~~在这一节里，我们将总结PCA, PCA白化和ZCA白化算法，并描述如何使用高效的线性代数库来实现它们。~~

+

-

+

-

首先，我们需要确保数据的均值（近似）为零。对于自然图像，我们通过减去每个图像块(patch)的均值（近似地）来达到这一目标。为此，我们计算每个图像块的均值，并从每个图像块中减去它的均值。（译注：参见PCA一章中“对图像数据应用PCA算法”一节）。Matlab实现如下：

+

-

+

-

avg = mean(x, 1); % ~~分别为每个图像块计算像素强度的均值。~~

+

x = x - repmat(avg, size(x, 1), 1);

-

~~下面，我们要计算~~ <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math> ~~，如果你在Matlab中实现（或者在C~~++, ~~Java等中实现，但可以使用高效的线性代数库），直接求和效率很低。不过，我们可以这样一气呵成。~~

+

Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>. If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as

sigma = x * x' / size(x, 2);

-

~~（自己推导一下看看）这里，我们假设~~ <math>x</math> ~~为一数据结构，其中每列表示一个训练样本（所以~~ <math>x</math> ~~是一个~~ <math>\textstyle n</math>×<math>\textstyle m</math> ~~的矩阵）。~~

+

(Check the math yourself for correctness.)

+

Here, we assume that <math>x</math> is a data structure that contains one training example per column (so, <math>x</math> is a <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix).

-

~~接下来，PCA计算~~ <math>\Sigma</math> ~~的特征向量。你可以使用Matlab的~~ <tt>eig</tt> ~~函数来计算。但是由于~~ <math>\Sigma</math> ~~是对称半正定的矩阵，用~~ <tt>svd</tt> ~~函数在数值计算上更加稳定。~~

+

Next, PCA computes the eigenvectors of <math>\Sigma</math>. One could do this using the Matlab <tt>eig</tt> function. However, because <math>\Sigma</math> is a symmetric positive semi-definite matrix, it is more numerically reliable to do this using the <tt>svd</tt> function. Concretely, if you implement

-

+

-

~~具体来说，如果你使用~~

+

[U,S,V] = svd(sigma);

-

~~那矩阵~~ <math>U</math> ~~将包含~~ <math>Sigma</math> ~~的特征向量（一个特征向量一列，从主向量开始排序），矩阵S 对角线上的元素将包含对应的特征值（同样降序排列）。矩阵~~ <math>~~\textstyle~~ V</math> 等于 <math>~~\textstyle~~ U</math> ~~的转置，可以忽略。~~

+

then the matrix <math>U</math> will contain the eigenvectors of <math>Sigma</math> (one eigenvector per column, sorted in order from top to bottom eigenvector), and the diagonal entries of the matrix <math>S</math> will contain the corresponding eigenvalues (also sorted in decreasing order). The matrix <math>V</math> will be equal to transpose of <math>U</math>, and can be safely ignored.

-

~~（注意~~ <tt>svd</tt> 函数实际上计算的是一个矩阵的奇异值和奇异向量，就对称半正定矩阵的特殊情况来说，它们对应于特征值和特征向量，这里我们也只关心这一特例。关于奇异向量和特征向量的详细讨论超出了本文范围。）

+

(Note: The <tt>svd</tt> function actually computes the singular vectors and singular values of a matrix, which for the special case of a symmetric positive semi-definite matrix---which is all that we're concerned with here---is equal to its eigenvectors and eigenvalues. A full discussion of singular vectors vs. eigenvectors is beyond the scope of these notes.)

-

~~最后，我们可以这样计算~~<math>\textstyle x_{\rm rot}</math> 和 <math>\textstyle \tilde{x}</math> ：

+

Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:

-

xRot = U' * x; % ~~数据旋转后的结果。~~

+

xRot = U' * x; % rotated version of the data.

-

xTilde = U(:,1:k)' * x; % ~~数据降维后的结果，这里k希望保留的特征向量的数目。~~

+

xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,

+

% where k is the number of eigenvectors to keep

-

这以<math>\textstyle \tilde{x} \in \Re^k</math>~~的形式给出了数据的PCA表示。顺便说一下，如果~~ <math>x</math> ~~是一个包括所有训练数据的~~ <math>\textstyle n</math>×<math>\textstyle m</math> ~~矩阵，这也是一种向量化的实现方式，上面的式子可以让你一次对所有的训练样本计算出~~ <math>x_{\rm rot}</math> 和 <math>\tilde{x}</math> ~~。得到的~~ <math>x_{\rm rot}</math> 和 <math>\tilde{x}</math> ~~中，每列对应一个训练样本。~~

+

This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.

+

Incidentally, if <math>x</math> is a <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized

+

implementation, and the expressions

+

above work too for computing <math>x_{\rm rot}</math> and <math>\tilde{x}</math> for your entire training set

+

all in one go. The resulting

+

<math>x_{\rm rot}</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.

-

~~为计算PCA白化后的数据~~ <math>\textstyle x_{\rm PCAwhite}</math> ~~，可以用~~

+

To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use

xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;

-

因为 <math>S</math> ~~的对角线包括了特征值~~ <math>\textstyle \lambda_i</math> ~~，这其实就是同时为所有样本<math>\textstyle i</math>计算~~

+

Since <math>S</math>'s diagonal contains the eigenvalues <math>\textstyle \lambda_i</math>,

-

<math>\textstyle x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i} }{\sqrt{\lambda_i}}</math> ~~的简洁表达。~~

+

this turns out to be a compact way

+

of computing <math>\textstyle x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i} }{\sqrt{\lambda_i}}</math>

+

simultaneously for all <math>\textstyle i</math>.

-

~~最后，你也可以这样计算ZCA白化后的数据~~<math>\textstyle x_{\rm ZCAwhite}</math>:

+

Finally, you can also compute the ZCA whitened data <math>\textstyle x_{\rm ZCAwhite}</math> as:

xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;

+

Implementing PCA/Whitening

From Ufldl

Latest revision as of 13:22, 7 April 2013

Views

Personal tools

ufldl resources

wiki

Search

Toolbox

@@ Line 1: / Line 1: @@
-版本	作者		Email			说明
+In this section, we summarize the PCA, PCA whitening and ZCA whitening algorithms,
+and also describe how you can implement them using efficient linear algebra libraries.
-.1	@visualzhou	visualzhou@gmail.com	初译，尽量保留原英文格式
+First, we need to ensure that the data has (approximately) zero-mean. For natural images, we achieve this (approximately) by subtracting the mean value of each image patch.
-.2	@Emma_lzhang	Emma.lzhang@gmail.com	一审校对
+We achieve this by computing the mean for each patch and subtracting it for each patch. In Matlab, we can do this by using
+  avg = mean(x, 1);     % Compute the mean pixel intensity value separately for each patch.
-在这一节里，我们将总结PCA, PCA白化和ZCA白化算法，并描述如何使用高效的线性代数库来实现它们。
-首先，我们需要确保数据的均值（近似）为零。对于自然图像，我们通过减去每个图像块(patch)的均值（近似地）来达到这一目标。为此，我们计算每个图像块的均值，并从每个图像块中减去它的均值。（译注：参见PCA一章中“对图像数据应用PCA算法”一节）。Matlab实现如下：
-  avg = mean(x, 1);     % 分别为每个图像块计算像素强度的均值。
   x = x - repmat(avg, size(x, 1), 1);
-下面，我们要计算 <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math> ，如果你在Matlab中实现（或者在C++, Java等中实现，但可以使用高效的线性代数库），直接求和效率很低。不过，我们可以这样一气呵成。
+Next, we need to compute <math>\textstyle \Sigma = \frac{1}{m} \sum_{i=1}^m (x^{(i)})(x^{(i)})^T</math>.  If you're implementing this in Matlab (or even if you're implementing this in C++, Java, etc., but have access to an efficient linear algebra library), doing it as an explicit sum is inefficient. Instead, we can compute this in one fell swoop as
   sigma = x * x' / size(x, 2);
-（自己推导一下看看）这里，我们假设 <math>x</math> 为一数据结构，其中每列表示一个训练样本（所以 <math>x</math> 是一个 <math>\textstyle n</math>×<math>\textstyle m</math> 的矩阵）。
+(Check the math yourself for correctness.)
+Here, we assume that <math>x</math> is a data structure that contains one training example per column (so, <math>x</math> is a <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix).
-接下来，PCA计算 <math>\Sigma</math> 的特征向量。你可以使用Matlab的 <tt>eig</tt> 函数来计算。但是由于 <math>\Sigma</math> 是对称半正定的矩阵，用 <tt>svd</tt> 函数在数值计算上更加稳定。
+Next, PCA computes the eigenvectors of <math>\Sigma</math>.  One could do this using the Matlab <tt>eig</tt> function.  However, because <math>\Sigma</math> is a symmetric positive semi-definite matrix, it is more numerically reliable to do this using the <tt>svd</tt> function. Concretely, if you implement
-具体来说，如果你使用
   [U,S,V] = svd(sigma);
-那矩阵 <math>U</math> 将包含 <math>Sigma</math> 的特征向量（一个特征向量一列，从主向量开始排序），矩阵S 对角线上的元素将包含对应的特征值（同样降序排列）。矩阵 <math>\textstyle V</math> 等于 <math>\textstyle U</math> 的转置，可以忽略。
+then the matrix <math>U</math> will contain the eigenvectors of <math>Sigma</math> (one eigenvector per column,  sorted in order from top to bottom eigenvector), and the diagonal entries of the matrix <math>S</math> will contain the corresponding eigenvalues (also sorted in decreasing order).  The matrix <math>V</math> will be equal to transpose of <math>U</math>, and can be safely ignored.
-（注意 <tt>svd</tt> 函数实际上计算的是一个矩阵的奇异值和奇异向量，就对称半正定矩阵的特殊情况来说，它们对应于特征值和特征向量，这里我们也只关心这一特例。关于奇异向量和特征向量的详细讨论超出了本文范围。）
+(Note: The <tt>svd</tt> function actually computes the singular vectors and singular values of a matrix, which for the special case of a symmetric positive semi-definite matrix---which is all that we're concerned with here---is equal to its eigenvectors and eigenvalues.  A full discussion of singular vectors vs. eigenvectors is beyond the scope of these notes.)
-最后，我们可以这样计 算<math>\textstyle x_{\rm rot}</math> 和 <math>\textstyle \tilde{x}</math> ：
+Finally, you can compute <math>\textstyle x_{\rm rot}</math> and <math>\textstyle \tilde{x}</math> as follows:
-  xRot = U' * x;          % 数据旋转后的结果。
+  xRot = U' * x;          % rotated version of the data.
-  xTilde = U(:,1:k)' * x; % 数据降维后的结果，这里k希望保留的特征向量的数目。
+  xTilde = U(:,1:k)' * x; % reduced dimension representation of the data,
+                         % where k is the number of eigenvectors to keep
-这以<math>\textstyle \tilde{x} \in \Re^k</math>的形式给出了数据的PCA表示。顺便说一下，如果 <math>x</math> 是一个包括所有训练数据的 <math>\textstyle n</math>×<math>\textstyle m</math> 矩阵，这也是一种向量化的实现方式，上面的式子可以让你一次对所有的训练样本计算出 <math>x_{\rm rot}</math> 和 <math>\tilde{x}</math> 。得到的 <math>x_{\rm rot}</math> 和 <math>\tilde{x}</math> 中，每列对应一个训练样本。
+This gives your PCA representation of the data in terms of <math>\textstyle \tilde{x} \in \Re^k</math>.
+Incidentally, if <math>x</math> is a <math>\textstyle n</math>-by-<math>\textstyle m</math> matrix containing all your training data, this is a vectorized
+implementation, and the expressions
+above work too for computing <math>x_{\rm rot}</math> and <math>\tilde{x}</math> for your entire training set
+all in one go.  The resulting
+<math>x_{\rm rot}</math> and <math>\tilde{x}</math> will have one column corresponding to each training example.
-为计算PCA白化后的数据 <math>\textstyle x_{\rm PCAwhite}</math> ，可以用
+To compute the PCA whitened data <math>\textstyle x_{\rm PCAwhite}</math>, use
   xPCAwhite = diag(1./sqrt(diag(S) + epsilon)) * U' * x;
-因为 <math>S</math> 的对角线包括了特征值 <math>\textstyle \lambda_i</math> ，这其实就是同时为所有样本<math>\textstyle i</math>计算
+Since <math>S</math>'s diagonal contains the eigenvalues <math>\textstyle \lambda_i</math>,
- <math>\textstyle x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i} }{\sqrt{\lambda_i}}</math> 的简洁表达。
+this turns out to be a compact way
+of computing <math>\textstyle x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i} }{\sqrt{\lambda_i}}</math>
+simultaneously for all <math>\textstyle i</math>.
-最后，你也可以这样计算ZCA白化后的数据<math>\textstyle x_{\rm ZCAwhite}</math>:
+Finally, you can also compute the ZCA whitened data <math>\textstyle x_{\rm ZCAwhite}</math> as:
   xZCAwhite = U * diag(1./sqrt(diag(S) + epsilon)) * U' * x;
+{{PCA}}
+{{Languages|实现主成分分析和白化|中文}}