Backpropagation vectorization hints
From Ufldl
Here, we give a few hints on how to vectorize the backpropagation step. These hints build on our earlier description of [[Neural Network Vectorization|how to vectorize a neural network]]. Assume we have already implemented the vectorized forward propagation steps, so that the matrix-valued <tt>z2</tt>, <tt>a2</tt>, <tt>z3</tt> and <tt>h</tt> have already been computed. Here was our unvectorized implementation of backprop:

<syntaxhighlight>
gradW1 = zeros(size(W1));
gradW2 = zeros(size(W2));
for i=1:m,
  delta3 = -(y(:,i) - h(:,i)) .* fprime(z3(:,i));
  delta2 = W2'*delta3 .* fprime(z2(:,i));
  gradW2 = gradW2 + delta3*a2(:,i)';
  gradW1 = gradW1 + delta2*a1(:,i)';
end;
</syntaxhighlight>

Assume that we have implemented a version of <tt>fprime(z)</tt> that accepts matrix-valued inputs. We will use matrix-valued <tt>delta3</tt> and <tt>delta2</tt>, each with <math>m</math> columns, one column per training example. We want to compute <tt>delta3</tt>, <tt>delta2</tt>, <tt>gradW2</tt> and <tt>gradW1</tt>. Consider first the computation of the matrix <tt>delta3</tt>, which can now be written:

<syntaxhighlight>
for i=1:m,
  delta3(:,i) = -(y(:,i) - h(:,i)) .* fprime(z3(:,i));
end;
</syntaxhighlight>

Each iteration of the for loop computes one column of <tt>delta3</tt>. You should be able to find a single line of Matlab that computes <tt>delta3</tt> as a function of the matrices <tt>y</tt>, <tt>h</tt> and <tt>z3</tt>. Similarly, you should be able to find a single line of code that computes the entire matrix <tt>delta2</tt> as a function of <tt>W2</tt>, <tt>delta3</tt> (which is now a matrix) and <tt>z2</tt>. Next, consider the computation of <tt>gradW2</tt>.
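As a sketch, one possible single-line vectorization of each of the two delta computations described above (assuming, as stated, that <tt>fprime</tt> operates elementwise on matrix-valued inputs) is:

<syntaxhighlight>
delta3 = -(y - h) .* fprime(z3);       % each column i equals -(y(:,i)-h(:,i)) .* fprime(z3(:,i))
delta2 = (W2'*delta3) .* fprime(z2);   % W2'*delta3 applies W2' to every column at once
</syntaxhighlight>

Here the matrix product <tt>W2'*delta3</tt> computes <tt>W2'*delta3(:,i)</tt> for all <math>m</math> columns simultaneously, so the elementwise product with <tt>fprime(z2)</tt> recovers the per-example update from the loop.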
We can now write this as:

<syntaxhighlight>
gradW2 = zeros(size(W2));
for i=1:m,
  gradW2 = gradW2 + delta3(:,i)*a2(:,i)';
end;
</syntaxhighlight>

You should be able to find a single line of Matlab that replaces this <tt>for</tt> loop and computes <tt>gradW2</tt> as a function of the matrices <tt>delta3</tt> and <tt>a2</tt>. If you're having trouble, take another look at the [[Logistic Regression Vectorization Example]], which uses a related (but slightly different) vectorization step to get to the final implementation. Using a similar method, you will also be able to compute <tt>gradW1</tt> with a single line of code. When you complete the derivation, you should be able to replace the unvectorized backpropagation code example above with just 4 lines of Matlab/Octave code.
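For reference, here is a sketch of one such 4-line implementation (assuming, as in the unvectorized code above, that <tt>a1</tt> denotes the matrix of input activations with one column per training example):

<syntaxhighlight>
delta3 = -(y - h) .* fprime(z3);       % one column of deltas per example
delta2 = (W2'*delta3) .* fprime(z2);
gradW2 = delta3*a2';                   % sums delta3(:,i)*a2(:,i)' over all i
gradW1 = delta2*a1';
</syntaxhighlight>

The key observation for the gradients is that the matrix product <tt>delta3*a2'</tt> is exactly the sum of the per-example outer products <tt>delta3(:,i)*a2(:,i)'</tt> that the <tt>for</tt> loop accumulated one at a time.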