== PCA on Images ==

For PCA to work, usually we want each of the features <math>\textstyle x_1, x_2, \ldots, x_n</math> to have a similar range of values to the others (and to have a mean close to zero). If you've used PCA on other applications before, you may therefore have separately pre-processed each feature to have zero mean and unit variance, by separately estimating the mean and variance of each feature <math>\textstyle x_j</math>. However, this isn't the pre-processing that we will apply to most types of images. Specifically, suppose we are training our algorithm on '''natural images''', so that <math>\textstyle x_j</math> is the value of pixel <math>\textstyle j</math>. By "natural images," we informally mean the type of image that a typical animal or person might see over their lifetime. (Usually we use images of outdoor scenes with grass, trees, etc., and cut out small, say 16x16, image patches randomly from these to train the algorithm. In practice, though, most feature learning algorithms are extremely robust to the exact type of image they are trained on, so most images taken with a normal camera should work, so long as they aren't excessively blurry and don't have strange artifacts.) In this case, it makes little sense to estimate a separate mean and variance for each pixel, because the statistics in one part of the image should (theoretically) be the same as any other. This property of images is called '''stationarity'''.
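
For contrast, here is a minimal NumPy sketch of the conventional per-feature standardization just described (the array <code>X</code> and its shape are made up for illustration); this is what you might use for non-image data, but it is not what we will do for natural images:

<pre>
import numpy as np

# Hypothetical training set: 100 examples (rows), 256 features (columns).
X = np.random.rand(100, 256)

# Conventional preprocessing: estimate a SEPARATE mean and variance
# for each feature across the training set, then standardize.
feature_mean = X.mean(axis=0)        # one mean per feature
feature_std = X.std(axis=0) + 1e-8   # one std per feature; epsilon avoids division by zero
X_standardized = (X - feature_mean) / feature_std
</pre>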

In detail, in order for PCA to work well, informally we require that (i) the features have approximately zero mean, and (ii) the different features have similar variances to each other. With natural images, (ii) is already satisfied even without variance normalization, and so we won't perform any variance normalization. (If you are training on audio data, say spectrograms, or on text data, say bag-of-words vectors, we will usually not perform variance normalization either.) In fact, PCA is invariant to the scaling of the data, and will return the same eigenvectors regardless of the scaling of the input. More formally, if you multiply each feature vector <math>\textstyle x</math> by some positive number (thus scaling every feature in every training example by the same number), PCA's output eigenvectors will not change.
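
To see this scale invariance concretely, here is a small NumPy sketch (the data and the scaling factor 7.3 are made up for illustration) checking that multiplying the data by a positive constant leaves the eigenvectors of the covariance matrix unchanged:

<pre>
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))   # hypothetical data: 50 examples, 3 features

def pca_eigenvectors(data):
    # Eigenvectors of the covariance matrix of the zero-mean data.
    data = data - data.mean(axis=0)
    sigma = data.T @ data / data.shape[0]
    _, U = np.linalg.eigh(sigma)
    return U

U1 = pca_eigenvectors(X)
U2 = pca_eigenvectors(7.3 * X)   # same positive scaling for every feature of every example

# The eigenvectors agree up to sign; only the eigenvalues scale (by 7.3^2).
assert np.allclose(np.abs(U1), np.abs(U2))
</pre>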

So, we won't use variance normalization. The only normalization we need to perform then is mean normalization, to ensure that the features have a mean around zero. Depending on the application, very often we are not interested in how bright the overall input image is. For example, in object recognition tasks, the overall brightness of the image doesn't affect what objects are in it. More formally, we are not interested in the mean intensity value of an image patch; thus, we can subtract out this value as a form of mean normalization.

Concretely, if <math>\textstyle x^{(i)} \in \Re^{n}</math> are the (grayscale) intensity values of a 16x16 image patch (<math>\textstyle n=256</math>), we might normalize the intensity of each image <math>\textstyle x^{(i)}</math> as follows:

\begin{align}
\mu^{(i)} &:= \frac{1}{n} \sum_{j=1}^n x^{(i)}_j \\
x^{(i)}_j &:= x^{(i)}_j - \mu^{(i)} \quad \text{for all } j
\end{align}

Note that the two steps above are done separately for each image <math>\textstyle x^{(i)}</math>, and that <math>\textstyle \mu^{(i)}</math> here is the mean intensity of the image <math>\textstyle x^{(i)}</math>. In particular, this is not the same thing as estimating a mean value separately for each pixel <math>\textstyle x_j</math>.
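
As a short sketch of these two steps (the array name <code>X</code> is made up; each row is assumed to hold one flattened 16x16 patch):

<pre>
import numpy as np

m, n = 100, 256
X = np.random.rand(m, n)   # hypothetical batch of m flattened 16x16 grayscale patches

# Per-image mean normalization: subtract each patch's OWN mean intensity
# (one mu per row), not a per-pixel mean estimated across the dataset.
mu = X.mean(axis=1, keepdims=True)   # shape (m, 1): mean intensity of each patch
X = X - mu

# Every patch now has (approximately) zero mean intensity.
assert np.allclose(X.mean(axis=1), 0.0)
</pre>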

== Non-natural images ==

If you are training your algorithm on images other than natural images (for example, images of handwritten characters, or images of single isolated objects centered against a white background), other types of normalization might be worth considering, and the best choice may be application dependent. But when training on natural images, using the per-image mean normalization as in the equations above would be a reasonable default.