Data Preprocessing

After doing the simple normalizations, whitening is often the next preprocessing step employed to help our algorithms work better. In practice, many deep learning algorithms rely on whitening to learn good features.
In performing PCA/ZCA whitening, it is pertinent to first zero-mean the features (across the dataset) to ensure that <math> \frac{1}{m} \sum_i x^{(i)} = 0 </math>. Specifically, this should be done before computing the covariance matrix. (The only exception is when per-example mean subtraction is performed and the data is stationary across dimensions/pixels.)
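As a concrete illustration, here is a minimal NumPy sketch of zero-meaning the features across the dataset before forming the covariance matrix. It assumes the data matrix <tt>X</tt> stores one example per column (<math>n</math> features by <math>m</math> examples); the function and variable names are illustrative, not part of this tutorial.

<pre>
import numpy as np

def zero_mean_and_covariance(X):
    """Zero-mean each feature across the dataset, then form the covariance matrix."""
    mean = X.mean(axis=1, keepdims=True)             # per-feature mean over the m examples
    X_centered = X - mean                            # now (1/m) * sum_i x^(i) = 0 for each feature
    sigma = X_centered @ X_centered.T / X.shape[1]   # covariance, computed AFTER centering
    return X_centered, mean, sigma
</pre>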
Next, one needs to select the value of <math>\epsilon</math> to use when performing [[Whitening | PCA/ZCA whitening]] (recall that this was the regularization term that has the effect of ''low-pass filtering'' the data). It turns out that selecting this value can also play an important role for feature learning; we discuss two cases for selecting <tt>epsilon</tt>:
=== Reconstruction Based Models ===
In models based on reconstruction (including Autoencoders, Sparse Coding, RBMs, k-Means), it is often preferable to set <tt>epsilon</tt> to a value such that low-pass filtering is achieved. One way to check this is to set a value for <tt>epsilon</tt>, run ZCA whitening, and thereafter visualize the data before and after whitening. If the value of <tt>epsilon</tt> is set too low, the data will look very noisy; conversely, if <tt>epsilon</tt> is set too high, you will see a "blurred" version of the original data.
{{Quote|
Tip: If your data has been scaled reasonably (e.g., to <math>[0, 1]</math>), start with <math>\epsilon = 0.01</math> or <math>\epsilon = 0.1</math>.
}}
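For concreteness, the following is a minimal sketch of regularized ZCA whitening together with the visual check described above. It assumes the data <tt>X</tt> has already been zero-meaned and stores one example per column; <tt>zca_whiten</tt> and the candidate <tt>epsilon</tt> values are illustrative choices, not prescribed by this tutorial.

<pre>
import numpy as np

def zca_whiten(X, epsilon=0.1):
    """ZCA whitening with regularization epsilon (which acts as a low-pass filter)."""
    sigma = X @ X.T / X.shape[1]                 # covariance of the zero-meaned data
    U, S, _ = np.linalg.svd(sigma)               # eigenvectors U, eigenvalues S
    # Rescale each principal component by 1/sqrt(eigenvalue + epsilon), then rotate back.
    return U @ np.diag(1.0 / np.sqrt(S + epsilon)) @ U.T @ X

# Visual check: try several values of epsilon and compare patches before and after
# whitening (too small -> the data looks noisy; too large -> it looks blurred).
# for eps in (1e-4, 0.01, 0.1):
#     X_white = zca_whiten(X, eps)   # visualize columns of X and X_white as image patches
</pre>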
=== ICA-based Models (with orthogonalization) ===
For ICA-based models with orthogonalization, it is ''very'' important for the data to be as close to white (identity covariance) as possible. This is a side-effect of using orthogonalization to decorrelate the features learned (more details in [[Independent Component Analysis | ICA]]). Hence, in this case, you will want to use an <tt>epsilon</tt> that is as small as possible (e.g., <math>\epsilon = 10^{-6}</math>).
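A quick way to verify that the data is close to white after whitening with such a small <tt>epsilon</tt> is to inspect the covariance of the whitened data, which should be near the identity matrix. The sketch below uses PCA whitening under the same assumptions as above (zero-meaned <tt>X</tt>, one example per column; names are illustrative).

<pre>
import numpy as np

def pca_whiten(X, epsilon=1e-6):
    """PCA whitening with a very small epsilon, as preferred for orthogonal ICA-based models."""
    sigma = X @ X.T / X.shape[1]
    U, S, _ = np.linalg.svd(sigma)
    return np.diag(1.0 / np.sqrt(S + epsilon)) @ U.T @ X

# Sanity check: the whitened covariance should be close to the identity matrix.
# X_white = pca_whiten(X)
# cov = X_white @ X_white.T / X_white.shape[1]
# print(np.abs(cov - np.eye(cov.shape[0])).max())   # should be small
</pre>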
== Large Images ==
For large images, PCA/ZCA-based whitening methods are impractical because the covariance matrix is too large. In these cases we resort to 1/f-whitening methods. (more details to come)
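Since the details are not spelled out here yet, the following is only a rough sketch of one common 1/f-whitening recipe: flatten the roughly <math>1/f</math> amplitude spectrum of natural images by multiplying each frequency component by its radial frequency, combined with a soft low-pass cutoff. The cutoff choice and all names are illustrative assumptions, not this tutorial's prescription.

<pre>
import numpy as np

def one_over_f_whiten(img, f0_fraction=0.4):
    """Approximate 1/f whitening of a single grayscale image in the frequency domain.
    Multiplies each frequency component by its radial frequency (compensating the
    ~1/f amplitude falloff of natural images) and applies a soft low-pass roll-off."""
    h, w = img.shape
    fy = np.fft.fftfreq(h)[:, None]
    fx = np.fft.fftfreq(w)[None, :]
    rho = np.sqrt(fx ** 2 + fy ** 2)             # radial frequency of each FFT bin
    f0 = f0_fraction * rho.max()                 # low-pass cutoff (illustrative choice)
    filt = rho * np.exp(-(rho / f0) ** 4)        # whitening ramp with soft roll-off
    return np.real(np.fft.ifft2(np.fft.fft2(img) * filt))
</pre>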
