Deriving gradients using the backpropagation idea
To have <math>J(z^{(4)}) = F(x)</math>, we can set <math>J(z^{(4)}) = \sum_k J(z^{(4)}_k)</math>, where <math>J(z^{(4)}_k) = \left( z^{(4)}_k \right)^2</math>.
Now that we can see <math>F</math> as a neural network, we can try to compute the gradient <math>\nabla_W F</math>. However, we now face the difficulty that <math>W</math> appears twice in the network. Fortunately, it turns out that if <math>W</math> appears multiple times in the network, the gradient with respect to <math>W</math> is simply the sum of gradients for each instance of <math>W</math> in the network (you may wish to work out a formal proof of this fact to convince yourself). With this in mind, we will proceed to work out the deltas first:
<table align="center">
<tr>
<th>Layer</th>
<th>Derivative of activation function <math>f'</math></th>
<th>Delta</th>
<th>Input <math>z</math> to this layer</th>
</tr>
<tr>
<td>4</td>
<td><math>f'(z^{(4)}) = 1</math> (identity)</td>
<td><math>\nabla_{z^{(4)}} J = 2 \left( W^T W x - x \right)</math></td>
<td><math>W^T W x - x</math></td>
</tr>
<tr>
<td>3</td>
<td><math>f'(z^{(3)}) = 1</math> (identity)</td>
<td><math>\left( I^T \delta^{(4)} \right) \bullet 1 = 2 \left( W^T W x - x \right)</math></td>
<td><math>W^T W x</math></td>
</tr>
<tr>
<td>2</td>
<td><math>f'(z^{(2)}) = 1</math> (identity)</td>
<td><math>\left( \left( W^T \right)^T \delta^{(3)} \right) \bullet 1 = 2 W \left( W^T W x - x \right)</math></td>
<td><math>W x</math></td>
</tr>
</table>
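To make the table concrete, here is a minimal NumPy sketch of the forward pass and the deltas above; the dimensions, the random seed, and the variable names are illustrative assumptions rather than part of the original derivation:

<pre>
import numpy as np

# Minimal sketch (not from the original page) checking the forward pass
# and deltas above; it assumes the layered view z2 = W x, z3 = W^T z2,
# z4 = z3 - x with identity activations, so that
# F(x) = J(z4) = sum_k (z4_k)^2 = ||W^T W x - x||^2.
rng = np.random.default_rng(0)
k, n = 3, 5
W = rng.standard_normal((k, n))   # W appears twice: as W and as W^T
x = rng.standard_normal(n)

# Forward pass through the network view of F.
z2 = W @ x          # layer 2 (identity activation, so a2 = z2)
z3 = W.T @ z2       # layer 3
z4 = z3 - x         # layer 4
F = z4 @ z4         # J(z4) = sum_k (z4_k)^2

# Deltas from the table, outermost layer first.
delta4 = 2 * z4                 # dJ / dz4
delta3 = delta4                 # identity weight I between layers 3 and 4
delta2 = W @ delta3             # (W^T)^T delta3 = W delta3

# J(z4) should equal F(x) = ||W^T W x - x||^2.
assert np.isclose(F, np.linalg.norm(W.T @ W @ x - x) ** 2)
</pre>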
To find the gradients with respect to <math>W</math>, first we find the gradients with respect to each instance of <math>W</math> in the network.
With respect to <math>W^T</math>:

<math>
\begin{align}
\nabla_{W^T} F & = \delta^{(3)} \left( a^{(2)} \right)^T \\
& = 2 \left( W^T W x - x \right) \left( W x \right)^T
\end{align}
</math>

With respect to <math>W</math>:

<math>
\begin{align}
\nabla_{W} F & = \delta^{(2)} \left( a^{(1)} \right)^T \\
& = 2 W \left( W^T W x - x \right) x^T
\end{align}
</math>

Finally, since <math>W</math> appears twice in the network, we sum the per-instance gradients, transposing the gradient with respect to the <math>W^T</math> instance so that both terms are gradients with respect to <math>W</math> (with a slight abuse of notation, we reuse <math>\nabla_W F</math> for the total gradient):

<math>
\begin{align}
\nabla_{W} F & = 2 \left( W x \right) \left( W^T W x - x \right)^T + 2 W \left( W^T W x - x \right) x^T
\end{align}
</math>
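As a sanity check that summing the per-instance gradients gives the right answer, the following sketch compares the closed-form gradient above with a centered finite-difference approximation; again, the sizes, seed, and helper function are illustrative assumptions:

<pre>
import numpy as np

# Sketch (not from the original page): finite-difference check of
#   grad_W F = 2 (W x)(W^T W x - x)^T + 2 W (W^T W x - x) x^T,
# i.e. the sum of the gradients for the two instances of W.
rng = np.random.default_rng(1)
k, n = 3, 5
W = rng.standard_normal((k, n))
x = rng.standard_normal(n)

def F(W):
    r = W.T @ W @ x - x
    return r @ r

r = W.T @ W @ x - x
grad_WT = 2 * np.outer(r, W @ x)   # w.r.t. the W^T instance (n x k)
grad_W  = 2 * np.outer(W @ r, x)   # w.r.t. the W instance   (k x n)
analytic = grad_WT.T + grad_W      # transpose the first, then sum

# Centered finite differences, one entry of W at a time.
eps = 1e-6
numeric = np.zeros_like(W)
for i in range(k):
    for j in range(n):
        Wp = W.copy(); Wp[i, j] += eps
        Wm = W.copy(); Wm[i, j] -= eps
        numeric[i, j] = (F(Wp) - F(Wm)) / (2 * eps)

assert np.allclose(analytic, numeric, atol=1e-6)
</pre>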
+ | |||
+ | |||
+ | |||
{{Languages|用反向传导思想求导|中文}}