Softmax Regression

Cost Function
<math>
\begin{align}
\nabla_{\theta_j} J(\theta) = - \frac{1}{m} \sum_{i=1}^{m}{ \left[ x^{(i)} \left( 1\{ y^{(i)} = j\}  - p(y^{(i)} = j | x^{(i)}; \theta) \right) \right]  }
\end{align}
</math>
<math>p(y^{(i)} = j | x^{(i)} ; \theta) = e^{\theta_j^T x^{(i)}}/(\sum_{l=1}^k e^{ \theta_l^T x^{(i)}} )</math>.
Recall the meaning of the "<math>\nabla_{\theta_j}</math>" notation.  In particular, <math>\nabla_{\theta_j} J(\theta)</math> is itself a vector, so that its <math>l</math>-th element is <math>\frac{\partial J(\theta)}{\partial \theta_{jl}}</math>, the partial derivative of <math>J(\theta)</math> with respect to the <math>l</math>-th element of <math>\theta_j</math>.
Armed with this formula for the derivative, one can then plug it into an algorithm such as gradient descent, and have it minimize <math>J(\theta)</math>.
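To make the gradient formula concrete, here is a minimal NumPy sketch, assuming a <math>k \times n</math> parameter matrix with one row <math>\theta_j</math> per class (the function name and array layout are illustrative choices, not from the original text):

```python
import numpy as np

def softmax_gradient(theta, X, y):
    """Gradient of the softmax cost J(theta), one row per class j.

    theta : (k, n) parameter matrix, row j is theta_j
    X     : (m, n) design matrix, row i is x^(i)
    y     : (m,)   integer labels in {0, ..., k-1}
    """
    m = X.shape[0]
    scores = X @ theta.T                          # (m, k): entries theta_j^T x^(i)
    scores -= scores.max(axis=1, keepdims=True)   # subtract max for numerical stability
    exp_scores = np.exp(scores)
    # p[i, j] = p(y^(i) = j | x^(i); theta) = e^{theta_j^T x^(i)} / sum_l e^{theta_l^T x^(i)}
    p = exp_scores / exp_scores.sum(axis=1, keepdims=True)
    indicator = np.eye(theta.shape[0])[y]         # (m, k): entries 1{y^(i) = j}
    # grad[j] = -(1/m) * sum_i x^(i) * (1{y^(i) = j} - p(y^(i) = j | x^(i); theta))
    return -(indicator - p).T @ X / m
```

Note that summing the returned gradient over the classes gives the zero vector, since <math>\sum_j (1\{y^{(i)} = j\} - p(y^{(i)} = j | x^{(i)})) = 1 - 1 = 0</math> for every example; this is one quick sanity check on an implementation.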

Revision as of 18:36, 10 May 2011
