Data Preprocessing

From Ufldl

Data preprocessing plays a very important role in many deep learning algorithms. In practice, many methods work best after the data has been normalized and whitened. However, the exact parameters for data preprocessing are usually not immediately apparent unless one has much experience working with the algorithms. On this page, we hope to demystify some of the preprocessing methods and also provide tips (and a "standard pipeline") for preprocessing data.
Tip: When approaching a dataset, the first thing to do is to look at the data itself and observe its properties. While the techniques here apply generally, you might want to do certain things differently for your dataset. For example, one standard preprocessing trick is to subtract the mean of each data point from itself (also known as removing the DC component, local mean subtraction, or subtractive normalization). While this makes sense for data such as natural images, it is less obviously appropriate for data with a natural "zero" point such as MNIST images (where all data points use the same value of 0 to represent an empty background).
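
The per-example mean subtraction described in the tip can be sketched as follows. This is a minimal illustration in plain Python, not code from this page: the function name `remove_dc` and the toy `patches` data are hypothetical, and each inner list stands in for one flattened data point.

```python
def remove_dc(example):
    """Subtract the example's own mean from each of its values."""
    mean = sum(example) / len(example)
    return [v - mean for v in example]

# Hypothetical toy data: each inner list is one flattened data point.
patches = [
    [0.2, 0.4, 0.6],   # values spread around a non-zero mean
    [0.0, 0.0, 0.9],   # mostly "empty background" encoded as 0
]

# Each data point is centered using its own mean, not a dataset-wide mean.
centered = [remove_dc(p) for p in patches]
```

In the second toy example the background value 0 shifts to about -0.3 after centering, which illustrates why this step is less obviously appropriate when 0 already has a fixed meaning across the dataset.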
== Feature Normalization ==
== PCA/ZCA Whitening ==

Revision as of 05:48, 29 April 2011
