# Visualizing a Trained Autoencoder

### From Ufldl


## Latest revision as of 12:49, 7 April 2013

Having trained a (sparse) autoencoder, we would now like to visualize the function learned by the algorithm, to try to understand what it has learned. Consider the case of training an autoencoder on $10 \times 10$ images, so that $n = 100$. Each hidden unit $i$ computes a function of the input:

$$a^{(2)}_i = f\left(\sum_{j=1}^{100} W^{(1)}_{ij} x_j + b^{(1)}_i \right).$$
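As a minimal sketch, the snippet below evaluates every hidden unit for a single input: an activation function $f$ applied to a weighted sum of the pixels plus a bias. The sigmoid choice for $f$, the random parameters, and the names `sigmoid` and `hidden_activations` are illustrative assumptions, not something the text above specifies.

```python
import math
import random

def sigmoid(z):
    """Logistic activation; an assumed choice for f."""
    return 1.0 / (1.0 + math.exp(-z))

def hidden_activations(W1, b1, x, f=sigmoid):
    """Return a^(2): one activation per hidden unit.

    W1 is a list of rows (one row of input weights per hidden unit),
    b1 is a list of biases, and x is the flattened input image.
    """
    return [f(sum(w_ij * x_j for w_ij, x_j in zip(row, x)) + b_i)
            for row, b_i in zip(W1, b1)]

# Illustrative random parameters: 100 hidden units, 10x10 = 100 pixels.
rng = random.Random(0)
n_hidden, n_pixels = 100, 100
W1 = [[rng.gauss(0.0, 0.1) for _ in range(n_pixels)] for _ in range(n_hidden)]
b1 = [0.0] * n_hidden
x = [rng.random() for _ in range(n_pixels)]

a2 = hidden_activations(W1, b1, x)  # list of 100 activations
```

Row $i$ of `W1` holds the weights $W^{(1)}_{ij}$ feeding hidden unit $i$, which is the quantity the rest of this page visualizes.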

We will visualize the function computed by hidden unit $i$, which depends on the parameters $W^{(1)}_{ij}$ (ignoring the bias term for now), using a 2D image. In particular, we think of $a^{(2)}_i$ as some non-linear feature of the input $x$. We ask: What input image $x$ would cause $a^{(2)}_i$ to be maximally activated? (Less formally, what is the feature that hidden unit $i$ is looking for?) For this question to have a non-trivial answer, we must impose some constraints on $x$. If we suppose that the input is norm constrained by $\|x\|^2 = \sum_{j=1}^{100} x_j^2 \leq 1$, then one can show (try doing this yourself) that the input which maximally activates hidden unit $i$ is given by setting pixel $x_j$ (for all 100 pixels, $j = 1, \ldots, 100$) to

$$x_j = \frac{W^{(1)}_{ij}}{\sqrt{\sum_{j=1}^{100} \left(W^{(1)}_{ij}\right)^2}}.$$
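The claim can be checked numerically. The sketch below assumes the constraint is a Euclidean norm of at most 1, that $f$ is monotonically increasing (so maximizing the weighted sum also maximizes the activation), and that the weight row is non-zero. Under those assumptions the maximizing input is the unit's weight row scaled to unit norm. The weights are random stand-ins and all function names are hypothetical.

```python
import math
import random

def max_activating_input(w_row):
    """Scale one hidden unit's weight row to unit Euclidean norm."""
    norm = math.sqrt(sum(w * w for w in w_row))
    return [w / norm for w in w_row]

def pre_activation(w_row, x):
    """Weighted sum of the inputs, with the bias ignored."""
    return sum(w * x_j for w, x_j in zip(w_row, x))

def random_unit_vector(n, rng):
    """Draw a random direction on the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

rng = random.Random(0)
w_row = [rng.gauss(0.0, 1.0) for _ in range(100)]  # hypothetical weights for one unit
x_star = max_activating_input(w_row)
best = pre_activation(w_row, x_star)

# No random unit-norm input should beat x_star.
trials = [pre_activation(w_row, random_unit_vector(100, rng)) for _ in range(1000)]
assert max(trials) <= best
```

The random search is only a sanity check, not a proof; the bound itself follows from the Cauchy-Schwarz inequality.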

By displaying the image formed by these pixel intensity values, we can begin to understand what feature hidden unit $i$ is looking for.

If we have an autoencoder with 100 hidden units (say), then our visualization will have 100 such images, one per hidden unit. By examining these 100 images, we can try to understand what the ensemble of hidden units is learning.
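One way to assemble such a visualization is sketched below: each hidden unit's weight row is scaled to unit norm, reshaped into a 10x10 image, and placed on a 10x10 grid with a one-pixel border. The helper names `unit_images` and `tile`, the padding, and the random stand-in weights are assumptions made for illustration.

```python
import math
import random

def unit_images(W1, side):
    """Turn each row of W1 into a side x side image of norm-bounded pixel values."""
    images = []
    for row in W1:
        norm = math.sqrt(sum(w * w for w in row))
        pixels = [w / norm for w in row]
        images.append([pixels[r * side:(r + 1) * side] for r in range(side)])
    return images

def tile(images, per_row, pad=1, fill=0.0):
    """Lay equally sized square images out on a grid separated by padding."""
    side = len(images[0])
    n_rows = math.ceil(len(images) / per_row)
    height = n_rows * side + (n_rows + 1) * pad
    width = per_row * side + (per_row + 1) * pad
    canvas = [[fill] * width for _ in range(height)]
    for k, img in enumerate(images):
        top = pad + (k // per_row) * (side + pad)
        left = pad + (k % per_row) * (side + pad)
        for r in range(side):
            # Copy one image row into its slot on the canvas.
            canvas[top + r][left:left + side] = img[r]
    return canvas

rng = random.Random(0)
W1 = [[rng.gauss(0.0, 1.0) for _ in range(100)] for _ in range(100)]  # stand-in weights
grid = tile(unit_images(W1, side=10), per_row=10)
```

The resulting `grid` is a plain 2D list of intensities. It could be handed to a plotting routine such as `matplotlib.pyplot.imshow(grid, cmap="gray")` if that library is available.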

When we do this for a sparse autoencoder (trained with 100 hidden units on 10x10 pixel inputs^{1}), we get the following result:

[Image: ExampleSparseAutoencoderWeights.png]

Each square in the figure above shows the (norm bounded) input image that maximally activates one of the 100 hidden units. We see that the different hidden units have learned to detect edges at different positions and orientations in the image.

These features are, not surprisingly, useful for object recognition and other vision tasks. When applied to other input domains (such as audio), this algorithm also learns useful representations/features for those domains.

^{1} *The learned features were obtained by training on whitened natural images. Whitening is a preprocessing step which removes redundancy in the input, by causing adjacent pixels to become less correlated.*
