Backpropagation Algorithm

From Ufldl

m
Line 71: Line 71:
partial derivatives of the cost function <math>J(W,b;x,y)</math> defined with respect
to a single example <math>(x,y)</math>.
-
Once we can compute these,
+
Once we can compute these, we see that
-
then by referring to Equation~(\ref{eqn-costfunction}), we see that
+
the derivative of the overall cost function <math>J(W,b)</math> can be computed as
the derivative of the overall cost function <math>J(W,b)</math> can be computed as
:<math>\begin{align}
:<math>\begin{align}
Line 126: Line 125:
The algorithm can then be written:
-
: 1. Perform a feedforward pass, computing the activations for layers <math>\textstyle L_2</math>, <math>\textstyle L_3</math>, up to the output layer <math>\textstyle L_{n_l}</math>, using Equations~(\ref{eqn-forwardprop1}-\ref{eqn-forwardprop2}).
+
: 1. Perform a feedforward pass, computing the activations for layers <math>\textstyle L_2</math>, <math>\textstyle L_3</math>, up to the output layer <math>\textstyle L_{n_l}</math>, using the equations defining the forward propagation steps.
: 2. For the output layer (layer <math>\textstyle n_l</math>), set  
::<math>\begin{align}
::<math>\begin{align}
Line 159: Line 158:
: 1. Set <math>\textstyle \Delta W^{(l)} := 0</math>, <math>\textstyle \Delta b^{(l)} := 0</math> (matrix/vector of zeros) for all <math>\textstyle l</math>.
: 2. For <math>\textstyle i = 1</math> to <math>\textstyle m</math>,  
-
:: 2a. Use backpropagation to compute <math>\textstyle \nabla_{W^{(l)}} J(W,b;x,y)</math> and \\
+
:: 2a. Use backpropagation to compute <math>\textstyle \nabla_{W^{(l)}} J(W,b;x,y)</math> and  
<math>\textstyle \nabla_{b^{(l)}} J(W,b;x,y)</math>.
:: 2b. Set <math>\textstyle \Delta W^{(l)} := \Delta W^{(l)} + \nabla_{W^{(l)}} J(W,b;x,y)</math>.  
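
The accumulation steps above (zero the accumulators, then add each example's gradient) can be sketched in plain Python. This is a minimal illustration under assumptions that are not stated in the lines shown here: a sigmoid activation, the squared-error cost J(W,b;x,y) = 0.5 * ||h(x) - y||^2 with no weight decay, and a bias accumulator that is updated in the same way as step 2b. The function names (`forward`, `backprop_single`, `accumulate_gradients`) are hypothetical and chosen only for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def forward(W, b, x):
    """Feedforward pass. Returns the activations of every layer, input first."""
    activations = [x]
    for Wl, bl in zip(W, b):
        prev = activations[-1]
        z = [sum(w * p for w, p in zip(row, prev)) + bias
             for row, bias in zip(Wl, bl)]
        activations.append([sigmoid(v) for v in z])
    return activations


def backprop_single(W, b, x, y):
    """Gradients of the assumed cost 0.5 * ||h(x) - y||^2 for one example."""
    a = forward(W, b, x)
    # Output-layer error term; for the sigmoid, f'(z) = a * (1 - a).
    delta = [-(yi - ai) * ai * (1.0 - ai) for yi, ai in zip(y, a[-1])]
    grad_W = [None] * len(W)
    grad_b = [None] * len(W)
    for l in range(len(W) - 1, -1, -1):
        # Gradient for the weights feeding the layer that owns delta.
        grad_W[l] = [[d * p for p in a[l]] for d in delta]
        grad_b[l] = list(delta)
        if l > 0:
            # Propagate the error one layer back: (W^T delta) .* f'(z).
            delta = [
                sum(W[l][j][i] * delta[j] for j in range(len(delta)))
                * a[l][i] * (1.0 - a[l][i])
                for i in range(len(a[l]))
            ]
    return grad_W, grad_b


def accumulate_gradients(W, b, examples):
    """Step 1: zero the accumulators. Step 2: add each example's gradient."""
    dW = [[[0.0] * len(row) for row in Wl] for Wl in W]
    db = [[0.0] * len(bl) for bl in b]
    for x, y in examples:
        gW, gb = backprop_single(W, b, x, y)  # step 2a
        for l in range(len(W)):
            for j in range(len(W[l])):
                for i in range(len(W[l][j])):
                    dW[l][j][i] += gW[l][j][i]  # step 2b
                db[l][j] += gb[l][j]  # assumed analogue of 2b for the bias
    return dW, db
```

Here `W[l]` is a list of rows (one per unit in the next layer) and `b[l]` is the matching list of biases. A quick sanity check for a sketch like this is to compare one entry of the accumulated gradient against a centered finite difference of the summed cost.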

Revision as of 23:53, 1 March 2011
