Gradient checking and advanced optimization

@@ Line 1: / Line 1: @@
 Backpropagation is a notoriously difficult algorithm to debug and get right,
-especially since many subtly buggy implementations of it---for example, one
+especially since many subtly buggy implementations of it&mdash;for example, one
 that has an off-by-one error in the indices and that thus only trains some of
-the layers of weights, or an implementation that omits the bias term---will
+the layers of weights, or an implementation that omits the bias term&mdash;will
 manage to learn something that can look surprisingly reasonable
 (while performing less well than a correct implementation).  Thus, even with a
@@ Line 108: / Line 108: @@
 to automatically search for a value of <math>\textstyle \theta</math> that minimizes <math>\textstyle J(\theta)</math>.  Algorithms
 such as L-BFGS and conjugate gradient can often be much faster than gradient descent.
+{{Sparse_Autoencoder}}
+{{Languages|梯度检验与高级优化|中文}}
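
The two ideas in the page title can be tied together with a small sketch: before handing a cost function to an off-the-shelf optimizer such as L-BFGS or conjugate gradient, compare its analytic gradient against a centered finite-difference estimate. The code below is illustrative only. The toy objective and the names `cost_and_grad`, `numerical_grad`, `max_abs_diff`, and the step size `eps` are assumptions made for this example and do not come from the page itself.

```python
def cost_and_grad(theta):
    """Toy objective J(theta) = sum_i (theta_i - i)^2 with its analytic gradient.

    Hypothetical stand-in for a real cost function such as a network's loss.
    """
    cost = sum((t - i) ** 2 for i, t in enumerate(theta))
    grad = [2.0 * (t - i) for i, t in enumerate(theta)]
    return cost, grad


def numerical_grad(f, theta, eps=1e-4):
    """Estimate the gradient of f with centered differences, one coordinate at a time."""
    approx = []
    for i in range(len(theta)):
        plus = list(theta)
        minus = list(theta)
        plus[i] += eps
        minus[i] -= eps
        # f returns (cost, grad); only the cost is used for the estimate.
        approx.append((f(plus)[0] - f(minus)[0]) / (2.0 * eps))
    return approx


def max_abs_diff(a, b):
    """Largest coordinate-wise absolute difference between two vectors."""
    return max(abs(x - y) for x, y in zip(a, b))


theta0 = [0.5, -1.0, 3.0]
_, analytic = cost_and_grad(theta0)
numeric = numerical_grad(cost_and_grad, theta0)
gap = max_abs_diff(analytic, numeric)
print("max |analytic - numeric| =", gap)
```

A small gap suggests the analytic gradient is consistent with the cost; a large one points to a bug of the kind described above, such as a skipped layer or a missing bias term. Once the check passes, a function that returns the cost and gradient together is the sort of input a general-purpose optimizer typically expects.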

Latest revision as of 12:40, 7 April 2013