神经网络

Revision as of 13:53, 8 March 2013 (view source)

Kandeng (Talk | contribs)

Revision as of 06:19, 9 March 2013 (view source)

Kandeng (Talk | contribs)

Line 24:

-

【初译】神经元是一个计算单元，输入为<math>x_1, x_2, x_3</math> (a +1为截距项，注这里a为多余？校对者注：这里a是“一个”的意思，不是变量) ，输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中<math>f : \Re \mapsto \Re</math>~~为激活函数。在这里，我们选择~~<math>f(\cdot)</math>为S型函数：

+

【初译】神经元是一个计算单元，输入为<math>x_1, x_2, x_3</math> (a +1为截距项，注这里a为多余？校对者注：这里a是“一个”的意思，不是变量) ，输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中<math>f : \Re \mapsto \Re</math>为'''激活函数'''。在这里，我们选择<math>f(\cdot)</math>为S型函数：

Line 32:

-

【一审】这个“神经元”是一个以<math>x_1, x_2, x_3</math> 及截距+1为输入值的运算单元，并输出<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>~~称为“激活函数”。在本课程中，我们的激活函数将选用Sigmoid函数：（一审注：因为tanh也是S型函数，所以以下函数不知如何命名）~~

+

【一审】这个“神经元”是一个以<math>x_1, x_2, x_3</math> 及截距+1为输入值的运算单元，并输出<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>称为“激活函数”。在本课程中，我们的'''激活函数'''将选用Sigmoid函数：（一审注：因为tanh也是S型函数，所以以下函数不知如何命名）

Line 40:

-

【二审】这个“神经元”是一个以<math>x_1, x_2, x_3</math>及截距+1为输入值的运算单元，其输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>~~称为“激活函数”。在本课程中，我们的激活函数将选用Sigmoid函数：~~

+

【二审】这个“神经元”是一个以<math>x_1, x_2, x_3</math>及截距+1为输入值的运算单元，其输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>称为“激活函数”。在本课程中，我们的'''激活函数'''将选用Sigmoid函数：
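The single-neuron computation described in this hunk can be sketched in Python. This is a minimal illustration, assuming the logistic sigmoid f(z) = 1 / (1 + exp(-z)) as the activation; the names `sigmoid` and `neuron_output` are hypothetical and do not appear in the tutorial, and the code is not hardened against overflow for very large negative inputs.

```python
import math

def sigmoid(z):
    """Logistic sigmoid activation: f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(weights, bias, inputs):
    """Compute h_{W,b}(x) = f(sum_i W_i * x_i + b) for a single neuron."""
    # Weighted sum of the inputs plus the intercept (bias) term.
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return sigmoid(z)

# Illustrative values only: with all-zero weights and zero bias the
# weighted sum is 0, so the neuron outputs f(0).
print(neuron_output([0.0, 0.0, 0.0], 0.0, [1.0, 2.0, 3.0]))
```

The bias is passed separately rather than folded into the weight vector, which mirrors the `+b` term in the formula.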

Line 143:

the value +1. We also let <math>s_l</math> denote the number of nodes in layer <math>l</math> (not counting the bias unit).

-

【初译】标记<math>n_l</math>为网络的层数；这样<math>n_l=3</math>。标记层<math>l</math>为<math>L_l</math>，这样层<math>L_l</math>为输入层，层<math>L_{n_l}</math> 为输出层。神经网络参数，参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>（或权重）<math>W^{(l)}_{ij}</math>表示层<math>l</math>节点<math>j</math>和层<math>l+1</math>节点<math>i</math>之间连接关系（注意角标的顺序。）<math>b^{(l)}_i</math>~~表示层l~~+ ~~1节点i与偏置节点之间的连接关系。这样，在我们的例子中，和。注意到偏置节点没有输入和连接，所以输出总是为值~~+~~1。我们标记为层l的节点数目（不包含偏置节点数）。~~

+

【初译】标记<math>n_l</math>为网络的层数；这样<math>n_l=3</math>。标记层<math>l</math>为<math>L_l</math>，这样层<math>L_1</math>为输入层，层<math>L_{n_l}</math>为输出层。神经网络的参数（或权重）为<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，其中<math>W^{(l)}_{ij}</math>表示层<math>l</math>节点<math>j</math>和层<math>l+1</math>节点<math>i</math>之间连接的参数（注意角标的顺序），<math>b^{(l)}_i</math>表示层<math>l+1</math>节点<math>i</math>的偏置项。这样，在我们的例子中，<math>W^{(1)} \in \Re^{3\times 3}</math>，<math>W^{(2)} \in \Re^{1\times 3}</math>。注意到偏置节点没有输入，也没有其他节点连向它，因为它的输出总是+1。我们标记<math>s_l</math>为层<math>l</math>的节点数目（不包含偏置节点）。

-

【一审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_l</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络有参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，这里第<math>l</math>层第<math>j</math>单元与第<math>l+1</math>层第<math>i</math>单元之间联接的参数就用<math>W^{(l)}_{ij}</math>来表示（注意标号顺序）。同样，<math>b^{(l)}_i</math>是第 + ~~1层第~~<math>i</math>单元的偏置项。(二审注:这里有点问题)~~因此，本例中，，。注意，偏置单元并没有输入值或与其它单元反向相联，这是因为它们总是只有一个输出值~~+~~1。同时，我们用sl表示第~~ 层的节点数（偏置单元不计在内）。

+

【一审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_1</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络有参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，这里第<math>l</math>层第<math>j</math>单元与第<math>l+1</math>层第<math>i</math>单元之间联接的参数就用<math>W^{(l)}_{ij}</math>来表示（注意标号顺序）。同样，<math>b^{(l)}_i</math>是第<math>l+1</math>层第<math>i</math>单元的偏置项。(二审注:这里有点问题)因此，本例中，<math>W^{(1)} \in \Re^{3\times 3}</math>，<math>W^{(2)} \in \Re^{1\times 3}</math>。注意，偏置单元并没有输入值或与其它单元反向相联，这是因为它们总是只有一个输出值+1。同时，我们用<math>s_l</math>表示第<math>l</math>层的节点数（偏置单元不计在内）。

-

【二审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_l</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络的参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，其中<math>W^{(l)}_{ij}</math>第<math>l</math>层第<math>j</math>号单元与第<math>l+1</math>层第<math>i</math>号单元之间联接的参数（注意标号顺序），<math>b^{(l)}_i</math>是第层的偏置项与 + ~~1层的第i号单元之间的参数。因此，本例中，，。注意，没有其他单元员连向偏置单元~~(也就是偏置单元没有输入)，因为它们总是输出+~~1。同时，我们用sl表示第~~ 层的节点数（偏置单元不计在内）。

+

【二审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_1</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络的参数为<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，其中<math>W^{(l)}_{ij}</math>是第<math>l</math>层第<math>j</math>号单元与第<math>l+1</math>层第<math>i</math>号单元之间联接的参数（注意标号顺序），<math>b^{(l)}_i</math>是第<math>l</math>层的偏置项与第<math>l+1</math>层的第<math>i</math>号单元之间的参数。因此，本例中，<math>W^{(1)} \in \Re^{3\times 3}</math>，<math>W^{(2)} \in \Re^{1\times 3}</math>。注意，没有其他单元连向偏置单元（也就是偏置单元没有输入），因为它们总是输出+1。同时，我们用<math>s_l</math>表示第<math>l</math>层的节点数（偏置单元不计在内）。

-

【原文】We will write to denote the activation (meaning output value) of unit i in layer . For = 1, we also use to denote the i-th input. Given a fixed setting of the parameters W,b, our neural network defines a hypothesis hW,b(x) that outputs a real number. Specifically, the computation that this neural network represents is given by:

+

【原文】We will write <math>a^{(l)}_i</math> to denote the '''activation''' (meaning output value) of unit <math>i</math> in layer <math>l</math>. For <math>l=1</math>, we also use <math>a^{(1)}_i = x_i</math> to denote the <math>i</math>-th input. Given a fixed setting of the parameters <math>W,b</math>, our neural

+

network defines a hypothesis <math>h_{W,b}(x)</math> that outputs a real number. Specifically, the computation that this neural network represents is given by:

+

:<math>

+

\begin{align}

+

a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)}) \\

+

a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)}) \\

+

a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)}) \\

+

h_{W,b}(x) &= a_1^{(3)} = f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})

+

\end{align}

+

</math>

-

~~【初译】用表示层l节点i激活（意味着有输出值）。对于l~~= ~~1时，我们用表示第i 个输入节点的输出值。给定参数W，b，神经网络就定义了一个假设函数hW~~,b(x)，其输出为真实值。下面明确的给出这个神经网络的计算公式：

+

【初译】用<math>a^{(l)}_i</math>表示层<math>l</math>节点<math>i</math>的'''激活值'''（即输出值）。当<math>l=1</math>时，我们也用<math>a^{(1)}_i = x_i</math>表示第<math>i</math>个输入。给定参数<math>W,b</math>，神经网络就定义了一个假设函数<math>h_{W,b}(x)</math>，其输出为一个实数。具体地，这个神经网络所表示的计算如下：

+

:<math>

+

\begin{align}

+

a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)}) \\

+

a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)}) \\

+

a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)}) \\

+

h_{W,b}(x) &= a_1^{(3)} = f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})

+

\end{align}

+

</math>

-

~~【一审】我们用表示第层第i单元的激活值（输出值）。当＝ 1时，同时也表示第i个单元的输入。对于给定参数集合W~~,~~b，我们的神经网络就以函数hW~~,b(x)计算输出结果。本例神经网络的计算过程就由以下步骤表示：

+

【一审】我们用<math>a^{(l)}_i</math>表示第<math>l</math>层第<math>i</math>单元的'''激活值'''（输出值）。当<math>l=1</math>时，我们也用<math>a^{(1)}_i = x_i</math>表示第<math>i</math>个输入值。对于给定参数集合<math>W,b</math>，我们的神经网络就以函数<math>h_{W,b}(x)</math>计算输出结果。本例神经网络的计算过程就由以下步骤表示：

-

+

:<math>

-

~~【二审】我们用表示第~~ ~~层第i号单元的激活值（输出值）。当~~ ~~＝ 1时，，也就是第i个特征的输入值。对于给定参数集合W~~,~~b，我们的神经网络就按照函数hW~~,b(x)计算输出结果。本例神经网络的计算过程就由以下步骤表示：

+

\begin{align}

+

a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)}) \\

+

a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)}) \\

+

a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)}) \\

+

h_{W,b}(x) &= a_1^{(3)} = f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})

+

\end{align}

+

</math>

+

【二审】我们用<math>a^{(l)}_i</math>表示第<math>l</math>层第<math>i</math>号单元的'''激活值'''（输出值）。当<math>l=1</math>时，<math>a^{(1)}_i = x_i</math>，也就是第<math>i</math>个特征的输入值。对于给定参数集合<math>W,b</math>，我们的神经网络就按照函数<math>h_{W,b}(x)</math>计算输出结果。本例神经网络的计算过程就由以下步骤表示：

+

:<math>

+

\begin{align}

+

a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)}) \\

+

a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)}) \\

+

a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)}) \\

+

h_{W,b}(x) &= a_1^{(3)} = f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})

+

\end{align}

+

</math>

-

【原文】In the sequel, we also let denote the total weighted sum of inputs to unit i in layer , including the bias term (e.g., ), so that.

+

【原文】In the sequel, we also let <math>z^{(l)}_i</math> denote the total weighted sum of inputs to unit <math>i</math> in layer <math>l</math>,

+

including the bias term (e.g., <math>\textstyle z_i^{(2)} = \sum_{j=1}^n W^{(1)}_{ij} x_j + b^{(1)}_i</math>), so that

+

<math>a^{(l)}_i = f(z^{(l)}_i)</math>.

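【初译】下面，我们用<math>z^{(l)}_i</math>表示层<math>l</math>节点<math>i</math>的加权后输入量与偏置项之和（例如，<math>\textstyle z_i^{(2)} = \sum_{j=1}^n W^{(1)}_{ij} x_j + b^{(1)}_i</math>），这样<math>a^{(l)}_i = f(z^{(l)}_i)</math>。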

From Ufldl


@@ Line 24: / Line 24: @@
-【初译】神经元是一个计算单元，输入为<math>x_1, x_2, x_3</math> (a +1为截距项，注这里a为多余？校对者注：这里a是“一个”的意思，不是变量) ，输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中<math>f : \Re \mapsto \Re</math>为激活函数。在这里，我们选择<math>f(\cdot)</math>为S型函数：
+【初译】神经元是一个计算单元，输入为<math>x_1, x_2, x_3</math> (a +1为截距项，注这里a为多余？校对者注：这里a是“一个”的意思，不是变量) ，输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中<math>f : \Re \mapsto \Re</math>为'''激活函数'''。在这里，我们选择<math>f(\cdot)</math>为S型函数：
@@ Line 32: / Line 32: @@
-【一审】这个“神经元”是一个以<math>x_1, x_2, x_3</math> 及截距+1为输入值的运算单元，并输出<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>称为“激活函数”。在本课程中，我们的激活函数将选用Sigmoid函数：（一审注：因为tanh也是S型函数，所以以下函数不知如何命名）
+【一审】这个“神经元”是一个以<math>x_1, x_2, x_3</math> 及截距+1为输入值的运算单元，并输出<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>称为“激活函数”。在本课程中，我们的'''激活函数'''将选用Sigmoid函数：（一审注：因为tanh也是S型函数，所以以下函数不知如何命名）
@@ Line 40: / Line 40: @@
-【二审】这个“神经元”是一个以<math>x_1, x_2, x_3</math>及截距+1为输入值的运算单元，其输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>称为“激活函数”。在本课程中，我们的激活函数将选用Sigmoid函数：
+【二审】这个“神经元”是一个以<math>x_1, x_2, x_3</math>及截距+1为输入值的运算单元，其输出为<math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>，其中函数<math>f : \Re \mapsto \Re</math>称为“激活函数”。在本课程中，我们的'''激活函数'''将选用Sigmoid函数：
@@ Line 143: / Line 143: @@
 the value +1.  We also let <math>s_l</math> denote the number of nodes in layer <math>l</math> (not counting the bias unit).
-【初译】标记<math>n_l</math>为网络的层数；这样<math>n_l=3</math>。标记层<math>l</math>为<math>L_l</math>，这样层<math>L_l</math>为输入层，层<math>L_{n_l}</math> 为输出层。神经网络参数，参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>（或权重）<math>W^{(l)}_{ij}</math>表示层<math>l</math>节点<math>j</math>和层<math>l+1</math>节点<math>i</math>之间连接关系（注意角标的顺序。）<math>b^{(l)}_i</math>表示层l+ 1节点i与偏置节点之间的连接关系。这样，在我们的例子中，和。注意到偏置节点没有输入和连接，所以输出总是为值+1。我们标记为层l的节点数目（不包含偏置节点数）。
+【初译】标记<math>n_l</math>为网络的层数；这样<math>n_l=3</math>。标记层<math>l</math>为<math>L_l</math>，这样层<math>L_1</math>为输入层，层<math>L_{n_l}</math>为输出层。神经网络的参数（或权重）为<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，其中<math>W^{(l)}_{ij}</math>表示层<math>l</math>节点<math>j</math>和层<math>l+1</math>节点<math>i</math>之间连接的参数（注意角标的顺序），<math>b^{(l)}_i</math>表示层<math>l+1</math>节点<math>i</math>的偏置项。这样，在我们的例子中，<math>W^{(1)} \in \Re^{3\times 3}</math>，<math>W^{(2)} \in \Re^{1\times 3}</math>。注意到偏置节点没有输入，也没有其他节点连向它，因为它的输出总是+1。我们标记<math>s_l</math>为层<math>l</math>的节点数目（不包含偏置节点）。
-【一审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_l</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络有参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，这里第<math>l</math>层第<math>j</math>单元与第<math>l+1</math>层第<math>i</math>单元之间联接的参数就用<math>W^{(l)}_{ij}</math>来表示（注意标号顺序）。同样，<math>b^{(l)}_i</math>是第 + 1层第<math>i</math>单元的偏置项。(二审注:这里有点问题)因此，本例中，，。注意，偏置单元并没有输入值或与其它单元反向相联，这是因为它们总是只有一个输出值+1。同时，我们用sl表示第 层的节点数（偏置单元不计在内）。
+【一审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_1</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络有参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，这里第<math>l</math>层第<math>j</math>单元与第<math>l+1</math>层第<math>i</math>单元之间联接的参数就用<math>W^{(l)}_{ij}</math>来表示（注意标号顺序）。同样，<math>b^{(l)}_i</math>是第<math>l+1</math>层第<math>i</math>单元的偏置项。(二审注:这里有点问题)因此，本例中，<math>W^{(1)} \in \Re^{3\times 3}</math>，<math>W^{(2)} \in \Re^{1\times 3}</math>。注意，偏置单元并没有输入值或与其它单元反向相联，这是因为它们总是只有一个输出值+1。同时，我们用<math>s_l</math>表示第<math>l</math>层的节点数（偏置单元不计在内）。
-【二审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_l</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络的参数<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，其中<math>W^{(l)}_{ij}</math>第<math>l</math>层第<math>j</math>号单元与第<math>l+1</math>层第<math>i</math>号单元之间联接的参数（注意标号顺序），<math>b^{(l)}_i</math>是第层的偏置项与 + 1层的第i号单元之间的参数。因此，本例中，，。注意，没有其他单元员连向偏置单元(也就是偏置单元没有输入)，因为它们总是输出+1。同时，我们用sl表示第 层的节点数（偏置单元不计在内）。
+【二审】我们用<math>n_l</math>表示网络的层数，因此，本例中<math>n_l=3</math>，将第<math>l</math>层记为<math>L_l</math>，于是<math>L_1</math>就是输入层，输出层记为<math>L_{n_l}</math>。本例神经网络的参数为<math>(W,b) = (W^{(1)}, b^{(1)}, W^{(2)}, b^{(2)})</math>，其中<math>W^{(l)}_{ij}</math>是第<math>l</math>层第<math>j</math>号单元与第<math>l+1</math>层第<math>i</math>号单元之间联接的参数（注意标号顺序），<math>b^{(l)}_i</math>是第<math>l</math>层的偏置项与第<math>l+1</math>层的第<math>i</math>号单元之间的参数。因此，本例中，<math>W^{(1)} \in \Re^{3\times 3}</math>，<math>W^{(2)} \in \Re^{1\times 3}</math>。注意，没有其他单元连向偏置单元（也就是偏置单元没有输入），因为它们总是输出+1。同时，我们用<math>s_l</math>表示第<math>l</math>层的节点数（偏置单元不计在内）。
-【原文】We will write to denote the activation (meaning output value) of unit i in layer . For = 1, we also use to denote the i-th input. Given a fixed setting of the parameters W,b, our neural network defines a hypothesis hW,b(x) that outputs a real number. Specifically, the computation that this neural network represents is given by:
+【原文】We will write <math>a^{(l)}_i</math> to denote the '''activation''' (meaning output value) of unit <math>i</math> in layer <math>l</math>.  For <math>l=1</math>, we also use <math>a^{(1)}_i = x_i</math> to denote the <math>i</math>-th input.  Given a fixed setting of the parameters <math>W,b</math>, our neural
+network defines a hypothesis <math>h_{W,b}(x)</math> that outputs a real number.  Specifically, the computation that this neural network represents is given by:
+:<math>
+\begin{align}
+a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)})  \\
+a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)})  \\
+a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)})  \\
+h_{W,b}(x) &= a_1^{(3)} =  f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})
+\end{align}
+</math>
-【初译】用表示层l节点i激活（意味着有输出值）。对于l= 1时，我们用表示第i 个输入节点的输出值。给定参数W，b，神经网络就定义了一个假设函数hW,b(x)，其输出为真实值。下面明确的给出这个神经网络的计算公式：
+【初译】用<math>a^{(l)}_i</math>表示层<math>l</math>节点<math>i</math>的'''激活值'''（即输出值）。当<math>l=1</math>时，我们也用<math>a^{(1)}_i = x_i</math>表示第<math>i</math>个输入。给定参数<math>W,b</math>，神经网络就定义了一个假设函数<math>h_{W,b}(x)</math>，其输出为一个实数。具体地，这个神经网络所表示的计算如下：
+:<math>
+\begin{align}
+a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)})  \\
+a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)})  \\
+a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)})  \\
+h_{W,b}(x) &= a_1^{(3)} =  f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})
+\end{align}
+</math>
-【一审】我们用表示第  层第i单元的激活值（输出值）。当  ＝ 1时，同时也表示第i个单元的输入。对于给定参数集合W,b，我们的神经网络就以函数hW,b(x)计算输出结果。本例神经网络的计算过程就由以下步骤表示：
+【一审】我们用<math>a^{(l)}_i</math>表示第<math>l</math>层第<math>i</math>单元的'''激活值'''（输出值）。当<math>l=1</math>时，我们也用<math>a^{(1)}_i = x_i</math>表示第<math>i</math>个输入值。对于给定参数集合<math>W,b</math>，我们的神经网络就以函数<math>h_{W,b}(x)</math>计算输出结果。本例神经网络的计算过程就由以下步骤表示：
+:<math>
-【二审】我们用表示第  层第i号单元的激活值（输出值）。当  ＝ 1时，，也就是第i个特征的输入值。对于给定参数集合W,b，我们的神经网络就按照函数hW,b(x)计算输出结果。本例神经网络的计算过程就由以下步骤表示：
+\begin{align}
+a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)})  \\
+a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)})  \\
+a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)})  \\
+h_{W,b}(x) &= a_1^{(3)} =  f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})
+\end{align}
+</math>
+【二审】我们用<math>a^{(l)}_i</math>表示第<math>l</math>层第<math>i</math>号单元的'''激活值'''（输出值）。当<math>l=1</math>时，<math>a^{(1)}_i = x_i</math>，也就是第<math>i</math>个特征的输入值。对于给定参数集合<math>W,b</math>，我们的神经网络就按照函数<math>h_{W,b}(x)</math>计算输出结果。本例神经网络的计算过程就由以下步骤表示：
+:<math>
+\begin{align}
+a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)})  \\
+a_2^{(2)} &= f(W_{21}^{(1)}x_1 + W_{22}^{(1)} x_2 + W_{23}^{(1)} x_3 + b_2^{(1)})  \\
+a_3^{(2)} &= f(W_{31}^{(1)}x_1 + W_{32}^{(1)} x_2 + W_{33}^{(1)} x_3 + b_3^{(1)})  \\
+h_{W,b}(x) &= a_1^{(3)} =  f(W_{11}^{(2)}a_1^{(2)} + W_{12}^{(2)} a_2^{(2)} + W_{13}^{(2)} a_3^{(2)} + b_1^{(2)})
+\end{align}
+</math>
-【原文】In the sequel, we also let denote the total weighted sum of inputs to unit i in layer , including the bias term (e.g., ), so that.
+【原文】In the sequel, we also let <math>z^{(l)}_i</math> denote the total weighted sum of inputs to unit <math>i</math> in layer <math>l</math>,
+including the bias term (e.g., <math>\textstyle z_i^{(2)} = \sum_{j=1}^n W^{(1)}_{ij} x_j + b^{(1)}_i</math>), so that
+<math>a^{(l)}_i = f(z^{(l)}_i)</math>.
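 【初译】下面，我们用<math>z^{(l)}_i</math>表示层<math>l</math>节点<math>i</math>的加权后输入量与偏置项之和（例如，<math>\textstyle z_i^{(2)} = \sum_{j=1}^n W^{(1)}_{ij} x_j + b^{(1)}_i</math>），这样<math>a^{(l)}_i = f(z^{(l)}_i)</math>。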
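The forward computation in the hunks above (first the weighted sum <math>z^{(l)}_i</math>, then the activation <math>a^{(l)}_i = f(z^{(l)}_i)</math>) can be sketched in plain Python. This is a minimal illustration, assuming a sigmoid activation; the function names and the parameter values are hypothetical, chosen only to match the shapes in the example (<math>W^{(1)}</math> is 3×3, <math>W^{(2)}</math> is 1×3), and are not taken from the tutorial.

```python
import math

def sigmoid(z):
    """Sigmoid activation f(z) = 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(W, b, a_prev):
    """Return (z, a) for one layer.

    z[i] = sum_j W[i][j] * a_prev[j] + b[i]   (total weighted input)
    a[i] = f(z[i])                            (activation)
    """
    z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a_prev)) + b_i
         for row, b_i in zip(W, b)]
    a = [sigmoid(z_i) for z_i in z]
    return z, a

def forward(params, x):
    """Propagate x through each (W, b) pair and return the output activations."""
    a = list(x)  # a^(1) = x: the first layer's activations are the inputs
    for W, b in params:
        _, a = layer_forward(W, b, a)
    return a

# Hypothetical parameter values with the shapes from the example.
W1 = [[0.0, 0.0, 0.0] for _ in range(3)]  # 3x3: layer 1 -> layer 2
b1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 1.0, 1.0]]                    # 1x3: layer 2 -> layer 3
b2 = [-1.5]

h = forward([(W1, b1), (W2, b2)], [1.0, 2.0, 3.0])
print(h)  # prints [0.5]
```

Indexing `W[i][j]` by destination unit `i` first and source unit `j` second follows the subscript order of <math>W^{(l)}_{ij}</math> that the text calls out. With these values every hidden unit has <math>z = 0</math> and outputs 0.5, so the output unit receives 1.5 − 1.5 = 0 and also outputs 0.5.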