Neural Networks

Revision as of 03:09, 14 March 2013 (view source)

Kandeng (Talk | contribs)

Revision as of 03:12, 14 March 2013 (view source)

Kandeng (Talk | contribs)

Line 1:

-

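Consider a supervised learning problem in which we have a set of labeled training examples <math>\textstyle (x^{(i)}, y^{(i)})</math>. Neural networks provide a way of defining a complex, non-linear hypothesis <math>\textstyle h_{W,b}(x)</math> with parameters <math>\textstyle W, b</math> that we can fit to our data.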

Line 9:

Line 8:

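This "neuron" is a computational unit that takes <math>\textstyle x_1, x_2, x_3</math> and a +1 intercept term as inputs, and outputs <math>\textstyle h_{W,b}(x) = f(W^Tx) = f(\sum_{i=1}^3 W_{i}x_i +b)</math>, where the function <math>\textstyle f : \Re \mapsto \Re</math> is called the "activation function". In this tutorial, we choose the sigmoid function

as the '''activation function''' <math>\textstyle f(\cdot)</math>: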

-

: <math>

+

:<math>

f(z) = \frac{1}{1+\exp(-z)}.

</math>
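As a rough illustration of the single-neuron computation above, the following Python sketch evaluates the sigmoid and the neuron output <math>\textstyle f(\sum_i W_i x_i + b)</math>. The function names `sigmoid` and `neuron_output` are hypothetical and chosen only for this example, and the sign-based branching is one common way to avoid floating-point overflow; it is not something the tutorial prescribes.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid f(z) = 1 / (1 + exp(-z))."""
    # Branch on the sign of z so that exp() never receives a large
    # positive argument, which would raise OverflowError.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def neuron_output(w, x, b):
    """Output of one neuron: f(sum_i w[i] * x[i] + b)."""
    z = sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
    return sigmoid(z)


if __name__ == "__main__":
    # Illustrative weights and inputs (made up for this example).
    print(neuron_output([0.2, -0.1, 0.4], [1.0, 2.0, 3.0], b=0.5))
```

With all weights and the intercept set to zero, the neuron outputs `sigmoid(0) = 0.5` regardless of the input.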

Line 17:

Line 16:

Although this tutorial series uses the sigmoid function, you could also choose the hyperbolic tangent (tanh) function:

-

: <math>

+

:<math>

f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}},

</math>

Line 28:

Line 27:

</div>

-

The <math>\textstyle \tanh(z)</math> function is a rescaled version of the sigmoid; its output range is <math>\textstyle (-1,1)</math> rather than the sigmoid's <math>\textstyle (0,1)</math>.

+

The <math>\textstyle \tanh(z)</math> function is a rescaled version of the sigmoid; its output range is <math>\textstyle (-1,1)</math> rather than the sigmoid's <math>\textstyle (0,1)</math>.

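Note that, unlike some other sources (including the OpenClassroom videos and Stanford's CS229 course), we do not use the convention <math>\textstyle x_0=1</math> here. Instead, the intercept term is handled by a separate parameter <math>\textstyle b</math>.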
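To make the relationship between tanh and the sigmoid concrete: the identity <math>\textstyle \tanh(z) = 2\,\sigma(2z) - 1</math>, where <math>\textstyle \sigma</math> denotes the sigmoid, follows directly from the two definitions above. The Python sketch below checks it numerically. The helper name `tanh_via_sigmoid` is hypothetical, and the plain `sigmoid` here is only suitable for moderate values of `z`.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid f(z) = 1 / (1 + exp(-z)); fine for moderate z."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh_via_sigmoid(z: float) -> float:
    """tanh expressed through the sigmoid: tanh(z) = 2 * sigmoid(2z) - 1."""
    return 2.0 * sigmoid(2.0 * z) - 1.0


if __name__ == "__main__":
    for z in (-3.0, -0.5, 0.0, 0.5, 3.0):
        # The two columns should agree up to floating-point rounding.
        print(z, tanh_via_sigmoid(z), math.tanh(z))
```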

Line 45:

Line 44:

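We write <math>\textstyle a^{(l)}_i</math> for the '''activation''' (output value) of unit <math>\textstyle i</math> in layer <math>\textstyle l</math>. For <math>\textstyle l=1</math>, we have <math>\textstyle a^{(1)}_i = x_i</math>, the <math>\textstyle i</math>-th input value (the <math>\textstyle i</math>-th feature of the input). Given a fixed setting of the parameters <math>\textstyle W,b</math>, our neural network computes its output according to the function <math>\textstyle h_{W,b}(x)</math>. The computation performed by the network in this example is given by the following steps: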

-

: <math>

+

:<math>

\begin{align}

a_1^{(2)} &= f(W_{11}^{(1)}x_1 + W_{12}^{(1)} x_2 + W_{13}^{(1)} x_3 + b_1^{(1)}) \\

Line 57:

Line 56:

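This gives us a more compact notation. If we extend the activation function <math>\textstyle f(\cdot)</math> to apply to vectors element-wise, i.e. <math>\textstyle f([z_1, z_2, z_3]) = [f(z_1), f(z_2), f(z_3)]</math>, then the equations above can be written more compactly as: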

-

: <math>\textstyle \begin{align}

+

:<math>\begin{align}

z^{(2)} &= W^{(1)} x + b^{(1)} \\

a^{(2)} &= f(z^{(2)}) \\

Line 66:

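We call the computation above '''forward propagation'''. Recall that we use <math>\textstyle a^{(1)} = x</math> to denote the activations of the input layer. Then, given the activations <math>\textstyle a^{(l)}</math> of layer <math>\textstyle l</math>, the activations <math>\textstyle a^{(l+1)}</math> of layer <math>\textstyle l+1</math> can be computed as follows: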

-

: <math> \begin{align}

+

:<math> \begin{align}

z^{(l+1)} &= W^{(l)} a^{(l)} + b^{(l)} \\

a^{(l+1)} &= f(z^{(l+1)})
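The recurrence <math>\textstyle z^{(l+1)} = W^{(l)} a^{(l)} + b^{(l)}</math>, <math>\textstyle a^{(l+1)} = f(z^{(l+1)})</math> can be sketched in plain Python as below. This is a minimal illustration under stated assumptions, not the tutorial's own implementation: the function name `forward`, the list-of-rows layout for each weight matrix, and the 3-3-1 layer sizes in the usage example are all choices made for this example.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic sigmoid f(z) = 1 / (1 + exp(-z)); fine for moderate z."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, weights, biases, f=sigmoid):
    """Forward propagation through a fully connected network.

    weights[k] is the matrix W^(k+1) stored as a list of rows, and
    biases[k] is the vector b^(k+1). Returns the list of activations
    [a^(1), a^(2), ...], where a^(1) is the input x.
    """
    activations = [list(x)]
    for W, b in zip(weights, biases):
        a = activations[-1]
        # z^(l+1) = W^(l) a^(l) + b^(l), computed one output unit at a time.
        z = [sum(w_ij * a_j for w_ij, a_j in zip(row, a)) + b_i
             for row, b_i in zip(W, b)]
        # a^(l+1) = f(z^(l+1)), with f applied element-wise.
        activations.append([f(z_i) for z_i in z])
    return activations


if __name__ == "__main__":
    # Assumed layout: 3 inputs, 3 hidden units, 1 output unit.
    W1 = [[0.0, 0.0, 0.0] for _ in range(3)]
    b1 = [0.0, 0.0, 0.0]
    W2 = [[1.0, 1.0, 1.0]]
    b2 = [-1.5]
    print(forward([1.0, 2.0, 3.0], [W1, W2], [b1, b2])[-1])
```

The last element of the returned list is the network output <math>\textstyle h_{W,b}(x)</math>. Keeping every intermediate activation, rather than only the output, is a deliberate choice here, since later steps such as computing gradients typically need them.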

Line 85:

neural networks 神经网络

-

activation function. 激活函数

+

activation function 激活函数

hyperbolic tangent 双曲正切函数

Line 91:

bias units 偏置项

-

activation激活值

+

activation 激活值

forward propagation 正向传播 (rendered as "正向传播" here to parallel "反向传播", the translation of "backpropagation")

From Ufldl
