Gradient checking and advanced optimization
From Ufldl
\frac{J(\theta+{\rm EPSILON}) - J(\theta-{\rm EPSILON})}{2 \times {\rm EPSILON}}
\end{align}</math>
In practice, we set <math>{\rm EPSILON}</math> to a small constant, say around <math>\textstyle 10^{-4}</math>.
(There's a large range of values of <math>{\rm EPSILON}</math> that should work well, but
we don't set <math>{\rm EPSILON}</math> to be "extremely" small, say <math>\textstyle 10^{-20}</math>,
as that would lead to numerical roundoff errors.)
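The centered-difference approximation above can be sketched in a few lines of Python (a minimal illustration, not the tutorial's own code; the function names here are made up for the example):

```python
import math

EPSILON = 1e-4  # small, but not so small that roundoff error dominates

def numerical_derivative(J, theta, eps=EPSILON):
    """Centered-difference approximation to dJ/dtheta at a scalar theta."""
    return (J(theta + eps) - J(theta - eps)) / (2 * eps)

# Example: J(theta) = theta^2 has true derivative 2*theta, so at theta = 3
# the approximation should come out very close to 6.
approx = numerical_derivative(lambda t: t * t, 3.0)
print(approx)
```

Note that making `eps` far smaller (say `1e-20`) would make the numerator a difference of nearly equal floating-point numbers, which is exactly the roundoff problem the text warns about.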
and "0"s everywhere else). So,
<math>\textstyle \theta^{(i+)}</math> is the same as <math>\textstyle \theta</math>, except its <math>\textstyle i</math>-th element has been incremented
by <math>{\rm EPSILON}</math>. Similarly, let <math>\textstyle \theta^{(i-)} = \theta - {\rm EPSILON} \times \vec{e}_i</math> be the
corresponding vector with the <math>\textstyle i</math>-th element decreased by <math>{\rm EPSILON}</math>.
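This per-coordinate perturbation scheme can be sketched as follows (a hedged Python/NumPy version of the idea; the tutorial itself works in MATLAB-style notation, and `numerical_gradient` is a name chosen for this example):

```python
import numpy as np

EPSILON = 1e-4

def numerical_gradient(J, theta, eps=EPSILON):
    """Approximate each partial derivative of J at the vector theta by
    perturbing one coordinate at a time along the basis vector e_i."""
    grad = np.zeros_like(theta)
    for i in range(theta.size):
        e_i = np.zeros_like(theta)
        e_i[i] = 1.0
        theta_plus = theta + eps * e_i    # theta^{(i+)}
        theta_minus = theta - eps * e_i   # theta^{(i-)}
        grad[i] = (J(theta_plus) - J(theta_minus)) / (2 * eps)
    return grad
```

For example, with <math>\textstyle J(\theta) = \sum_i \theta_i^2</math> the true gradient is <math>\textstyle 2\theta</math>, and the loop above recovers it to high accuracy at each coordinate.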
We can now numerically verify <math>\textstyle g_i(\theta)</math>'s correctness by checking, for each <math>\textstyle i</math>,
that:
\nabla_{b^{(l)}} J(W,b) &= \frac{1}{m} \Delta b^{(l)}.
\end{align}</math>
This result shows that the final block of pseudo-code in [[Backpropagation Algorithm]] is indeed
implementing gradient descent.
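One common way to apply this in practice is to compare the analytic gradient against the numerical one and report a relative difference. The sketch below is illustrative only (the helper name, test function, and tolerance are assumptions, not part of the tutorial):

```python
import numpy as np

def gradient_check(J, grad_J, theta, eps=1e-4):
    """Relative difference between the analytic gradient grad_J(theta)
    and the centered-difference numerical gradient at theta."""
    analytic = grad_J(theta)
    numeric = np.zeros_like(theta)
    for i in range(theta.size):
        e_i = np.zeros_like(theta)
        e_i[i] = 1.0
        numeric[i] = (J(theta + eps * e_i) - J(theta - eps * e_i)) / (2 * eps)
    # Small values indicate the two gradients agree.
    return np.linalg.norm(analytic - numeric) / (
        np.linalg.norm(analytic) + np.linalg.norm(numeric))

# Illustrative check on J(theta) = 0.5 * ||theta||^2, whose gradient is theta.
diff = gradient_check(lambda t: 0.5 * float(np.dot(t, t)),
                      lambda t: t,
                      np.array([1.0, -2.0, 3.0]))
print(diff)  # very small when the analytic gradient is correct
```

A correct backpropagation implementation should likewise produce a tiny relative difference against its numerical gradient, while a buggy one typically produces a difference that is orders of magnitude larger.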
To make sure your implementation of gradient descent is correct, it is