Whitening

which is needed for some algorithms.  If we are training on images, the raw input is redundant, since adjacent pixel values are highly correlated.  The goal of whitening is to make the input less redundant; more formally, our desiderata are that our learning algorithm sees a training input where (i) the features are less correlated with each other, and (ii) the features all have the same variance.
[[File:PCA-rotated.png | 600px]]
The covariance matrix of this data is given by:

<math>\begin{align}
\begin{bmatrix}
7.29 & 0  \\
0 & 0.69
\end{bmatrix}.
\end{align}</math>
(Note: Technically, many of the statements in this section about the "covariance" will be true only if the data has zero mean.  In the rest of this section, we will take this assumption as implicit in our statements.  However, even if the data's mean isn't exactly zero, the intuitions we're presenting here still hold true, and so this isn't something that you should worry about.)
It is no accident that the diagonal values are <math>\textstyle \lambda_1</math> and <math>\textstyle \lambda_2</math>.  Further, the off-diagonal entries are zero; thus, <math>\textstyle x_{{\rm rot},1}</math> and <math>\textstyle x_{{\rm rot},2}</math> are uncorrelated, satisfying one of our desiderata for whitened data (that the features be less correlated).
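To make this concrete, here is a minimal numpy sketch (not part of the original notes) of estimating the covariance of zero-mean data, taking its eigendecomposition, and forming the rotated data <math>\textstyle x_{\rm rot} = U^Tx</math>, following the convention from the earlier PCA section that <math>\textstyle U</math> holds the eigenvectors of the covariance and that each column of <math>\textstyle x</math> is one training example.  The synthetic data below is purely illustrative:

<pre>
import numpy as np

# Synthetic correlated 2-D data, one training example per column (illustration only;
# the underlying variances are chosen to roughly match the 7.29 / 0.69 example above).
rng = np.random.default_rng(0)
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
x = rot @ (rng.standard_normal((2, 10000)) * np.array([[2.7], [0.83]]))
x -= x.mean(axis=1, keepdims=True)            # enforce the zero-mean assumption from the note

sigma = x @ x.T / x.shape[1]                  # empirical covariance of the zero-mean data
lam, U = np.linalg.eigh(sigma)                # eigenvalues (ascending) and eigenvectors (columns)
lam, U = lam[::-1], U[:, ::-1]                # reorder so that lambda_1 >= lambda_2

x_rot = U.T @ x                               # data rotated into the PCA basis
print(np.round(x_rot @ x_rot.T / x.shape[1], 2))   # ~diag(lambda_1, lambda_2); off-diagonals ~0
</pre>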
To make each of our input features have unit variance, we can simply rescale each feature <math>\textstyle x_{{\rm rot},i}</math> by <math>\textstyle 1/\sqrt{\lambda_i}</math>:

<math>\begin{align}
x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i} }{\sqrt{\lambda_i}}.
\end{align}</math>
Plotting <math>\textstyle x_{{\rm PCAwhite}}</math>, we get:

[[File:PCA-whitened.png | 600px]]
This data now has covariance equal to the identity matrix <math>\textstyle I</math>.  We say that <math>\textstyle x_{{\rm PCAwhite}}</math> is our '''PCA whitened''' version of the data: the different components of <math>\textstyle x_{{\rm PCAwhite}}</math> are uncorrelated and have unit variance.
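As a small illustration (a sketch continuing the synthetic numpy example above, not code from the notes), PCA whitening is just this per-component rescaling, and the covariance of the result is approximately the identity:

<pre>
import numpy as np

# Same synthetic data and names as the first sketch (illustration only).
rng = np.random.default_rng(0)
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
x = rot @ (rng.standard_normal((2, 10000)) * np.array([[2.7], [0.83]]))
x -= x.mean(axis=1, keepdims=True)
lam, U = np.linalg.eigh(x @ x.T / x.shape[1])
lam, U = lam[::-1], U[:, ::-1]
x_rot = U.T @ x

# PCA whitening: rescale each rotated coordinate by 1 / sqrt(lambda_i).
x_PCAwhite = x_rot / np.sqrt(lam)[:, None]

# Covariance of the whitened data is approximately the identity matrix.
print(np.round(x_PCAwhite @ x_PCAwhite.T / x.shape[1], 2))
</pre>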
'''Whitening combined with dimensionality reduction.'''
If you want to have data that is whitened and which is lower dimensional than the original input, you can also optionally keep only the top <math>\textstyle k</math> components of <math>\textstyle x_{{\rm PCAwhite}}</math>.  When we combine PCA whitening with regularization (described later), the last few components of <math>\textstyle x_{{\rm PCAwhite}}</math> will be nearly zero anyway, and thus can safely be dropped.
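For instance (continuing the same illustrative numpy sketch, with a hypothetical choice of <math>\textstyle k</math>), combining whitening with dimensionality reduction just means keeping the first <math>\textstyle k</math> rows of <math>\textstyle x_{\rm PCAwhite}</math>:

<pre>
import numpy as np

# Same synthetic data and names as the sketches above (illustration only).
rng = np.random.default_rng(0)
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
x = rot @ (rng.standard_normal((2, 10000)) * np.array([[2.7], [0.83]]))
x -= x.mean(axis=1, keepdims=True)
lam, U = np.linalg.eigh(x @ x.T / x.shape[1])
lam, U = lam[::-1], U[:, ::-1]
x_PCAwhite = (U.T @ x) / np.sqrt(lam)[:, None]

# Keep only the top k whitened components (hypothetical k; here k = 1 of n = 2).
k = 1
x_PCAwhite_reduced = x_PCAwhite[:k, :]   # k-by-m: whitened and lower dimensional
print(x_PCAwhite_reduced.shape)
</pre>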
== ZCA Whitening ==
Finally, it turns out that this way of getting the data to have covariance identity <math>\textstyle I</math> isn't unique.  If <math>\textstyle R</math> is any orthogonal matrix (a rotation or reflection), then <math>\textstyle R x_{\rm PCAwhite}</math> will also have identity covariance.  In ZCA whitening, we choose <math>\textstyle R = U</math> (the matrix of eigenvectors of the covariance) and define

<math>\begin{align}
x_{\rm ZCAwhite} = U x_{\rm PCAwhite}
\end{align}</math>
Plotting <math>\textstyle x_{\rm ZCAwhite}</math>, we get:

[[File:ZCA-whitened.png | 600px]]
It can be shown that out of all possible choices for <math>\textstyle R</math>, this choice of rotation causes <math>\textstyle x_{\rm ZCAwhite}</math> to be as close as possible to the original input data <math>\textstyle x</math>.

When using ZCA whitening (unlike PCA whitening), we usually keep all <math>\textstyle n</math> dimensions of the data, and do not try to reduce its dimension.
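A minimal numpy sketch of ZCA whitening, under the same assumptions and synthetic data as the earlier sketches (an illustration, not code from the notes): rotating the PCA-whitened data back with <math>\textstyle U</math> keeps the identity covariance while staying close to the original <math>\textstyle x</math>:

<pre>
import numpy as np

# Same synthetic data and names as the sketches above (illustration only).
rng = np.random.default_rng(0)
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
x = rot @ (rng.standard_normal((2, 10000)) * np.array([[2.7], [0.83]]))
x -= x.mean(axis=1, keepdims=True)
lam, U = np.linalg.eigh(x @ x.T / x.shape[1])
lam, U = lam[::-1], U[:, ::-1]
x_PCAwhite = (U.T @ x) / np.sqrt(lam)[:, None]

# ZCA whitening: rotate back with U.  All n dimensions are kept.
x_ZCAwhite = U @ x_PCAwhite
print(np.round(x_ZCAwhite @ x_ZCAwhite.T / x.shape[1], 2))   # still ~identity covariance
</pre>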
== Regularization ==
When implementing PCA whitening or ZCA whitening in practice, sometimes some of the eigenvalues <math>\textstyle \lambda_i</math> will be numerically close to 0, and thus the scaling step where we divide by <math>\sqrt{\lambda_i}</math> would involve dividing by a value close to zero; this may cause the data to blow up (take on large values) or otherwise be numerically unstable.  In practice, we therefore implement this scaling step using a small amount of regularization, and add a small constant <math>\textstyle \epsilon</math> to the eigenvalues before taking their square root and inverse:
<math>\begin{align}
x_{{\rm PCAwhite},i} = \frac{x_{{\rm rot},i} }{\sqrt{\lambda_i + \epsilon}}.
\end{align}</math>
When <math>\textstyle x</math> takes values around <math>\textstyle [-1,1]</math>, a value of <math>\textstyle \epsilon \approx 10^{-5}</math> might be typical.
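In code, the only change from the earlier illustrative PCA whitening sketch is adding <math>\textstyle \epsilon</math> under the square root; the same regularized scaling is used before the optional ZCA rotation:

<pre>
import numpy as np

# Same synthetic data and names as the sketches above (illustration only).
rng = np.random.default_rng(0)
theta = np.pi / 6
rot = np.array([[np.cos(theta), -np.sin(theta)], [np.sin(theta), np.cos(theta)]])
x = rot @ (rng.standard_normal((2, 10000)) * np.array([[2.7], [0.83]]))
x -= x.mean(axis=1, keepdims=True)
lam, U = np.linalg.eigh(x @ x.T / x.shape[1])
lam, U = lam[::-1], U[:, ::-1]
x_rot = U.T @ x

# Regularized whitening: add a small constant to the eigenvalues before the
# square root, so near-zero eigenvalues cannot blow the data up.
epsilon = 1e-5
x_PCAwhite = x_rot / np.sqrt(lam + epsilon)[:, None]
x_ZCAwhite = U @ x_PCAwhite
</pre>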
For the case of images, adding <math>\textstyle \epsilon</math> here also has the effect of slightly smoothing (or low-pass filtering) the input image.
performed by ZCA.  This results in a less redundant representation of the input image, which is then transmitted to your brain.