Neural Network Vectorization (神经网络向量化)

From Ufldl

@@ Line 84: / Line 84: @@
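 (For an autoencoder, we would simply set <math>y^{(i)} = x^{(i)}</math>, but here we consider the more general case.)
+Suppose our outputs are <math>s_3</math>-dimensional, so that the label vector of each example is <math>y^{(i)} \in \Re^{s_3}</math>. In our Matlab/Octave data structure, we stack these outputs in columns to form a Matlab/Octave variable <tt>y</tt>, whose <tt>i</tt>-th column <tt>y(:,i)</tt> is <math>y^{(i)}</math>.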
-Suppose we have <math>s_3</math> dimensional outputs, so that our target labels are <math>y^{(i)} \in \Re^{s_3}</math>.  In our Matlab/Octave datastructure, we will stack these in columns to form a Matlab/Octave variable <tt>y</tt>, so that the <math>i</math>-th column <tt>y(:,i)</tt> is <math>y^{(i)}</math>.
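+We now want to compute the gradient terms <math>\nabla_{W^{(l)}} J(W,b)</math> and <math>\nabla_{b^{(l)}} J(W,b)</math>. Consider the first of these terms. As described earlier for the backpropagation algorithm, for each training example <math>(x,y)</math> we can compute it as follows:
-Suppose the output of each example is <math>s_3</math>-dimensional, so that the label vector of each example is <math>y^{(i)} \in \Re^{s_3}</math>. In our Matlab/Octave data structure, we stack these outputs in columns to form a Matlab/Octave variable <tt>y</tt>, whose <tt>i</tt>-th column <tt>y(:,i)</tt> is <math>y^{(i)}</math>.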
-We now want to compute the gradient terms
-<math>\nabla_{W^{(l)}} J(W,b)</math> and <math>\nabla_{b^{(l)}} J(W,b)</math>.  Consider the first of
-these terms.  Following our earlier description of the [[Backpropagation Algorithm]], we had that for a single training example <math>(x,y)</math>, we can compute the derivatives as
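-'''First-draft translation:'''
-We now need to compute the gradient terms <math>\nabla_{W^{(l)}} J(W,b)</math> and <math>\nabla_{b^{(l)}} J(W,b)</math>. For the first gradient, following our earlier description in [[反向传导算法|Backpropagation Algorithm]], for a single training example <math>(x,y)</math> we can compute the derivatives:
-'''First review:'''
-We now want to compute the gradient terms <math>\nabla_{W^{(l)}} J(W,b)</math> and <math>\nabla_{b^{(l)}} J(W,b)</math>. Consider the first of these gradient terms. As described earlier for the backpropagation algorithm, for each training example <math>(x,y)</math> we can compute it as follows: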
 ::<math>
@@ Line 108: / Line 97: @@
 </math>
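+Here, <math>\bullet</math> denotes the element-wise product of two vectors (translator's note: the result is again a vector). For simplicity, we ignore the derivatives with respect to the parameters <math>b^{(l)}</math> for now, although an actual implementation of backpropagation still has to compute them.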
-Here, <math>\bullet</math> denotes element-wise product.  For simplicity, our description here will ignore the derivatives with respect to <math>b^{(l)}</math>, though your implementation of backpropagation will have to compute those derivatives too.
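+Suppose we have already implemented vectorized forward propagation, computing the matrix-valued variables <tt>z2</tt>, <tt>a2</tt>, <tt>z3</tt> and <tt>h</tt> as in 9-5 above; an unvectorized version of backpropagation can then be implemented as follows:
-'''First-draft translation:'''
-Here, <math>\bullet</math> denotes the element-level product. To simplify, our description here will ignore b(l). However, when you actually implement backpropagation, you still need to compute their derivatives.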
-<math>b^{(l)}</math>
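-'''First review:'''
-Here, <math>\bullet</math> stands for the element-wise product of two vectors (translator's note: the result is again a vector). For simplicity, we ignore the derivatives with respect to the parameters <math>b^{(l)}</math> for now, although an actual implementation of backpropagation still has to compute them.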
-Suppose we have already implemented the vectorized forward propagation method, so that the matrix-valued <tt>z2</tt>, <tt>a2</tt>,  <tt>z3</tt> and <tt>h</tt> are computed as described above. We can then implement an ''unvectorized'' version of backpropagation as follows:
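-'''First-draft translation:'''
-Suppose we have already vectorized the forward propagation method, with the matrix values of <tt>z2</tt>, <tt>a2</tt>, <tt>z3</tt> and <tt>h</tt> computed as described earlier. We can then implement an unvectorized version of backpropagation as follows:
-'''First review:'''
-Suppose we have already vectorized the forward propagation step, computing the values of the matrix-valued variables <tt>z2</tt>, <tt>a2</tt>, <tt>z3</tt> and <tt>h</tt> as in 9-5 above; the unvectorized version of backpropagation is then implemented as shown below: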
 <syntaxhighlight>
@@ Line 138: / Line 113: @@
 </syntaxhighlight>
-This implementation has a <tt>for</tt> loop.  We would like to come up with an implementation that simultaneously performs backpropagation on all the examples, and eliminates this <tt>for</tt> loop.
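+This implementation contains a <tt>for</tt> loop; we want a vectorized version that processes all the examples simultaneously and eliminates this <tt>for</tt> loop.
-'''First-draft translation:'''
-This implementation contains a <tt>for</tt> loop. We need an implementation that runs over all the examples at once, so that the <tt>for</tt> loop can be removed.
-'''First review:'''
-This implementation contains a <tt>for</tt> loop; we want a vectorized implementation that processes all the examples simultaneously, removing this <tt>for</tt> loop.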
 To do so, we will replace the vectors <tt>delta3</tt> and <tt>delta2</tt> with matrices, where one column of each matrix corresponds to each training example.  We will also implement a function <tt>fprime(z)</tt> that takes as input a matrix <tt>z</tt>, and applies <math>f'(\cdot)</math> element-wise.  Each of the four lines of Matlab in the <tt>for</tt> loop above can then be vectorized and replaced with a single line of Matlab code (without a surrounding <tt>for</tt> loop).
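
The replacement described above might look roughly like the following Matlab/Octave sketch. It is an illustration under stated assumptions, not the tutorial's reference solution. It assumes a sigmoid activation and a squared-error cost summed over the examples, with no weight decay or division by the number of examples. The layer sizes, the random data and every variable name other than <tt>y</tt>, <tt>h</tt>, <tt>z2</tt>, <tt>a2</tt>, <tt>z3</tt>, <tt>delta2</tt>, <tt>delta3</tt> and <tt>fprime</tt> are hypothetical. The derivatives with respect to <math>b^{(l)}</math> are again left out.

<syntaxhighlight>
% Hypothetical layer sizes and random data, chosen only for illustration.
s1 = 4; s2 = 3; s3 = 2; m = 5;
x  = rand(s1, m);                 % one training example per column
y  = rand(s3, m);                 % one target vector per column
W1 = randn(s2, s1); b1 = randn(s2, 1);
W2 = randn(s3, s2); b2 = randn(s3, 1);

f      = @(z) 1 ./ (1 + exp(-z));      % sigmoid activation (an assumption)
fprime = @(z) f(z) .* (1 - f(z));      % its derivative, applied element-wise

% Vectorized forward propagation over all m examples.
z2 = W1 * x  + repmat(b1, 1, m);  a2 = f(z2);
z3 = W2 * a2 + repmat(b2, 1, m);  h  = f(z3);

% Vectorized backpropagation: column i of each delta matrix belongs to example i.
delta3 = -(y - h) .* fprime(z3);         % s3-by-m
delta2 = (W2' * delta3) .* fprime(z2);   % s2-by-m
gradW2 = delta3 * a2';                   % the matrix product sums over examples
gradW1 = delta2 * x';

% Sanity check against a per-example loop; both differences should be near zero.
gradW2_loop = zeros(size(W2));
gradW1_loop = zeros(size(W1));
for i = 1:m
  d3 = -(y(:,i) - h(:,i)) .* fprime(z3(:,i));
  d2 = (W2' * d3) .* fprime(z2(:,i));
  gradW2_loop = gradW2_loop + d3 * a2(:,i)';
  gradW1_loop = gradW1_loop + d2 * x(:,i)';
end
disp(max(abs(gradW2(:) - gradW2_loop(:))));
disp(max(abs(gradW1(:) - gradW1_loop(:))));
</syntaxhighlight>

The sum over examples that the loop accumulates step by step is carried out here by the matrix products <tt>delta3 * a2'</tt> and <tt>delta2 * x'</tt>, which is why no explicit loop is needed.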

Revision as of 12:50, 16 March 2013
