Backpropagation Algorithm

Revision as of 23:26, 26 February 2011 by Ang

Revision as of 23:53, 1 March 2011 by Ang (minor edit)

From Ufldl

@@ Line 71: / Line 71: @@
 partial derivatives of the cost function <math>J(W,b;x,y)</math> defined with respect
 to a single example <math>(x,y)</math>.
-Once we can compute these,
+Once we can compute these, we see that
-then by referring to Equation~(\ref{eqn-costfunction}), we see that
 the derivative of the overall cost function <math>J(W,b)</math> can be computed as
 :<math>\begin{align}
@@ Line 126: / Line 125: @@
 The algorithm can then be written:
-: 1. Perform a feedforward pass, computing the activations for layers <math>\textstyle L_2</math>, <math>\textstyle L_3</math>, up to the output layer <math>\textstyle L_{n_l}</math>, using Equations~(\ref{eqn-forwardprop1}-\ref{eqn-forwardprop2}).
+: 1. Perform a feedforward pass, computing the activations for layers <math>\textstyle L_2</math>, <math>\textstyle L_3</math>, up to the output layer <math>\textstyle L_{n_l}</math>, using the equations defining the forward propagation steps.
 : 2. For the output layer (layer <math>\textstyle n_l</math>), set
 ::<math>\begin{align}
@@ Line 159: / Line 158: @@
 : 1. Set <math>\textstyle \Delta W^{(l)} := 0</math>, <math>\textstyle \Delta b^{(l)} := 0</math> (matrix/vector of zeros) for all <math>\textstyle l</math>.
 : 2. For <math>\textstyle i = 1</math> to <math>\textstyle m</math>,
-:: 2a. Use backpropagation to compute <math>\textstyle \nabla_{W^{(l)}} J(W,b;x,y)</math> and \\
+:: 2a. Use backpropagation to compute <math>\textstyle \nabla_{W^{(l)}} J(W,b;x,y)</math> and
 <math>\textstyle \nabla_{b^{(l)}} J(W,b;x,y)</math>.
 :: 2b. Set <math>\textstyle \Delta W^{(l)} := \Delta W^{(l)} + \nabla_{W^{(l)}} J(W,b;x,y)</math>.
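
The last hunk describes an accumulation loop: zero the accumulators, then for each of the m training examples run backpropagation and add the per-example gradient to the running sum. The following is a minimal sketch of that loop in plain Python, not the page's own implementation. It rests on assumptions the diff does not show: sigmoid units in every layer, a squared-error cost J(W,b;x,y) = 1/2 ||h(x) - y||^2, and an update for the bias accumulator that mirrors step 2b. The names `sigmoid`, `backprop`, and `accumulate_gradients` are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def backprop(W, b, x, y):
    """Gradients of J(W,b;x,y) = 0.5 * ||h(x) - y||^2 for a single example.

    W[l] is a list of rows (one row per unit in the next layer) and b[l] is
    the matching list of biases. Sigmoid activations are an assumption.
    """
    # Step 1: feedforward pass, keeping the activations of every layer.
    activations = [x]
    for Wl, bl in zip(W, b):
        a_prev = activations[-1]
        z = [sum(w * a for w, a in zip(row, a_prev)) + bi
             for row, bi in zip(Wl, bl)]
        activations.append([sigmoid(zi) for zi in z])

    # Step 2: error term for the output layer. For a sigmoid unit,
    # f'(z) = a * (1 - a).
    a_out = activations[-1]
    delta = [-(yi - ai) * ai * (1.0 - ai) for yi, ai in zip(y, a_out)]

    grad_W = [None] * len(W)
    grad_b = [None] * len(W)
    for l in range(len(W) - 1, -1, -1):
        a_prev = activations[l]
        grad_W[l] = [[d * a for a in a_prev] for d in delta]
        grad_b[l] = list(delta)
        if l > 0:
            # Propagate the error term one layer back:
            # delta_prev = (W^T delta) * f'(z_prev)
            delta = [
                sum(W[l][j][i] * delta[j] for j in range(len(delta)))
                * a_prev[i] * (1.0 - a_prev[i])
                for i in range(len(a_prev))
            ]
    return grad_W, grad_b


def accumulate_gradients(W, b, examples):
    # 1. Set Delta W := 0 and Delta b := 0 for all layers l.
    dW = [[[0.0] * len(row) for row in Wl] for Wl in W]
    db = [[0.0] * len(bl) for bl in b]
    # 2. For i = 1 to m:
    for x, y in examples:
        # 2a. Per-example gradients via backpropagation.
        gW, gb = backprop(W, b, x, y)
        for l in range(len(W)):
            for j in range(len(W[l])):
                for k in range(len(W[l][j])):
                    # 2b. Delta W := Delta W + grad_W J(W,b;x,y)
                    dW[l][j][k] += gW[l][j][k]
                # Assumed analogous update for Delta b (not in the hunk).
                db[l][j] += gb[l][j]
    return dW, db
```

Calling `accumulate_gradients(W, b, examples)` with `examples` as a list of `(x, y)` pairs returns the summed gradients over all examples. The step that turns these sums into a parameter update lies outside the hunks shown, so it is left out here.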