Sparse Coding: Autoencoder Interpretation

From Ufldl

Jump to: navigation, search
Line 57: Line 57:
</ol>
</ol>
-
Observe that with our modified objective function, the objective function <math>J(A, s)</math> given <math>s</math>, that is <math>J(A; s) = \lVert As - x \rVert_2^2 + \gamma \lVert A \rVert_2^2</math> (the L1 term in <math>s</math> can be omitted since it is not a function of <math>A</math>) is simply a quadratic function of <math>A</math>, and hence has an easily derivable analytic solution in <math>A</math>. A quick way to derive this solution would be to use matrix calculus - some pages about matrix calculus can be found in the [[useful links]] section.
+
Observe that with our modified objective function, the objective function <math>J(A, s)</math> given <math>s</math>, that is <math>J(A; s) = \lVert As - x \rVert_2^2 + \gamma \lVert A \rVert_2^2</math> (the L1 term in <math>s</math> can be omitted since it is not a function of <math>A</math>) is simply a quadratic term in <math>A</math>, and hence has an easily derivable analytic solution in <math>A</math>. A quick way to derive this solution would be to use matrix calculus - some pages about matrix calculus can be found in the [[Useful Links | useful links]] section.
Optimizing for this objective function using the iterative method as above, will yield features (the basis vectors of <math>A</math>) similar to those learned using the sparse autoencoder. For more practical tips on implementing sparse coding, you may wish to refer to [[Exercise:Sparse Coding | the sparse coding exercise]]. For assistance with deriving the gradients, you may wish to refer to [[Deriving gradients using the backpropagation idea]].
Optimizing for this objective function using the iterative method as above, will yield features (the basis vectors of <math>A</math>) similar to those learned using the sparse autoencoder. For more practical tips on implementing sparse coding, you may wish to refer to [[Exercise:Sparse Coding | the sparse coding exercise]]. For assistance with deriving the gradients, you may wish to refer to [[Deriving gradients using the backpropagation idea]].

Revision as of 05:40, 29 May 2011

Personal tools