# Data Preprocessing

### From Ufldl


=== Audio (MFCC/Spectrograms) ===

For audio data (MFCCs and spectrograms), each dimension usually has a different scale (variance); the first component of the MFCCs, for example, is the DC component and usually has a larger magnitude than the other components. The differences in scale are especially pronounced when one includes the temporal derivatives (a common practice in audio processing). As a result, preprocessing usually starts with simple data standardization (zero mean and unit variance per data dimension), followed by PCA/ZCA whitening (with an appropriate <tt>epsilon</tt>).
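The two steps described for audio features (per-dimension standardization, then ZCA whitening) might be sketched as follows. This is only an illustration under stated assumptions: the function names <tt>standardize</tt> and <tt>zca_whiten</tt> are hypothetical, the data is synthetic noise standing in for MFCC frames, and the value of <tt>epsilon</tt> is arbitrary and would need tuning on real data.

```python
import numpy as np


def standardize(x, eps=1e-8):
    """Zero-mean, unit-variance scaling per dimension.

    x has shape (n_examples, n_dims). eps is a small constant
    (an assumption of this sketch) that avoids division by zero.
    """
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    return (x - mean) / (std + eps)


def zca_whiten(x, epsilon=1e-2):
    """ZCA whitening of data that is already zero-mean per dimension.

    epsilon regularizes small eigenvalues; 1e-2 is an arbitrary choice.
    """
    sigma = x.T @ x / x.shape[0]                # covariance matrix
    eigvals, eigvecs = np.linalg.eigh(sigma)    # sigma is symmetric
    eigvals = np.maximum(eigvals, 0.0)          # guard against round-off
    scale = 1.0 / np.sqrt(eigvals + epsilon)
    w_zca = eigvecs @ np.diag(scale) @ eigvecs.T
    return x @ w_zca


# Synthetic stand-in for MFCC frames: 13 dimensions with very
# different scales, the first one being the largest.
rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 13)) * np.linspace(50.0, 0.5, 13)

standardized = standardize(features)
whitened = zca_whiten(standardized, epsilon=1e-2)
whitened_cov = whitened.T @ whitened / whitened.shape[0]
```

With a small <tt>epsilon</tt>, <tt>whitened_cov</tt> should be close to the identity matrix; a larger <tt>epsilon</tt> shrinks the low-variance directions more strongly and acts as a low-pass filter on the data.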

=== MNIST Handwritten Digits ===

The MNIST dataset has pixel values in the range <math>[0, 255]</math>. We thus start with simple rescaling to shift the data into the range <math>[0, 1]</math>. In practice, removing the mean-value per example can also help feature learning. ''Note: While one could also elect to use PCA/ZCA whitening on MNIST if desired, this is not often done in practice.''
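As a rough sketch of the rescaling and per-example mean removal described above, assuming the images are stored as a <tt>uint8</tt> array with one flattened 28x28 image per row: the function name <tt>preprocess_mnist</tt> is hypothetical, and random pixel values stand in for the real dataset.

```python
import numpy as np


def preprocess_mnist(images):
    """Rescale pixels to [0, 1], then remove the mean of each example.

    images has shape (n_examples, 784) with integer values in [0, 255].
    Returns both the rescaled data and the per-example centered data.
    """
    rescaled = images.astype(np.float64) / 255.0
    # axis=1 averages over the pixels of one image, not over the dataset.
    centered = rescaled - rescaled.mean(axis=1, keepdims=True)
    return rescaled, centered


# Random pixels as a placeholder for real MNIST images.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(5, 784), dtype=np.uint8)

rescaled, centered = preprocess_mnist(fake_images)
```

Note that after per-example mean removal the values no longer lie in <math>[0, 1]</math>; each image instead has zero mean, with both negative and positive entries.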