Softmax Regression

== Introduction ==

'''Softmax regression''', also known as '''multinomial logistic regression''', is a generalisation of logistic regression to problems where there are more than two possible classes.
Given an input <math>x</math>, the hypothesis outputs an <math>n</math>-dimensional vector of estimated probabilities, one for each of the <math>n</math> classes:

<math>
\begin{align}
h_\theta(x) = \begin{bmatrix} P(y = 1 | x; \theta) \\ P(y = 2 | x; \theta) \\ \vdots \\ P(y = n | x; \theta) \end{bmatrix}
= \frac{1}{ \sum_{j=1}^{n}{e^{ \theta_j^T x }} }
\begin{bmatrix}
e^{ \theta_1^T x } \\
e^{ \theta_2^T x } \\
\vdots \\
e^{ \theta_n^T x }
\end{bmatrix}
\end{align}
</math>

where <math>\theta_1, \theta_2, \ldots, \theta_n</math> are each <math>k</math>-dimensional column vectors that constitute the parameters of our hypothesis.
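As a concrete illustration, here is a minimal NumPy sketch of this hypothesis; the function name <code>softmax_probs</code> and the array conventions (parameters stored as the columns of a <math>k \times n</math> matrix) are illustrative assumptions, not part of the text above:

<source lang="python">
import numpy as np

def softmax_probs(theta, x):
    """Return the vector of probabilities P(y = j | x) for j = 1..n.

    theta -- (k, n) array whose columns are theta_1, ..., theta_n
    x     -- (k,) input vector
    """
    scores = theta.T @ x      # theta_j^T x for every class j
    scores -= scores.max()    # shift before exponentiating, for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / exp_scores.sum()
</source>

Subtracting the maximum score before exponentiating leaves the probabilities unchanged (it cancels between numerator and denominator) but avoids overflow.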
For a training set <math>\{ (x^{(i)}, y^{(i)}) \}</math>, the log-likelihood of the parameters is

<math>
\begin{align}
\ell(\theta) &= \sum_{i} \ln P(y^{(i)} | x^{(i)}; \theta) \\
&= \sum_{i} \left( \theta^T_{y^{(i)}} x^{(i)} - \ln \sum_{j=1}^{n}{e^{ \theta_j^T x^{(i)} }} \right)
\end{align}
</math>
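In code, <math>\ell(\theta)</math> follows directly from the last line above. The sketch below is a hypothetical NumPy implementation under the same conventions as <code>softmax_probs</code> (inputs stacked as the rows of <code>X</code>, labels encoded as integers <code>0, ..., n-1</code>), again using a shifted log-sum-exp so the exponentials cannot overflow:

<source lang="python">
import numpy as np

def log_likelihood(theta, X, y):
    """Compute l(theta) summed over a dataset.

    theta -- (k, n) parameter matrix
    X     -- (m, k) array, one training input per row
    y     -- (m,) integer class labels in {0, ..., n-1}
    """
    scores = X @ theta                                 # entry (i, j) is theta_j^T x^(i)
    shift = scores.max(axis=1, keepdims=True)          # stabilise the exponentials
    log_sum = shift[:, 0] + np.log(np.exp(scores - shift).sum(axis=1))
    return np.sum(scores[np.arange(X.shape[0]), y] - log_sum)
</source>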
To find <math>\theta</math> such that <math>\ell(\theta)</math> is maximised, we first find the derivatives of <math>\ell(\theta)</math> with respect to <math>\theta_{k}</math>:

<math>
\begin{align}
\nabla_{\theta_k} \ell(\theta) &= \sum_{i} \left( I_{ \{ y^{(i)} = k\} } - P(y^{(i)} = k | x^{(i)}) \right) x^{(i)}
\end{align}
</math>
With this, we can now find a set of parameters that maximises <math>\ell(\theta)</math>, for instance by using gradient ascent.
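As a sketch of what that might look like, the following batch gradient-ascent loop applies the derivative above; the function name, learning rate, and iteration count are illustrative assumptions:

<source lang="python">
import numpy as np

def fit_softmax(X, y, n, lr=0.1, iters=500):
    """Maximise l(theta) by batch gradient ascent.

    X  -- (m, k) training inputs; y -- (m,) labels in {0, ..., n-1}
    n  -- number of classes; lr, iters -- illustrative hyperparameters
    """
    m, k = X.shape
    theta = np.zeros((k, n))
    for _ in range(iters):
        scores = X @ theta
        scores -= scores.max(axis=1, keepdims=True)    # numerical stability
        probs = np.exp(scores)
        probs /= probs.sum(axis=1, keepdims=True)      # P(y = j | x^(i)) for all i, j
        indicator = np.zeros_like(probs)
        indicator[np.arange(m), y] = 1.0               # I{y^(i) = j}
        theta += lr * (X.T @ (indicator - probs))      # column j is sum_i (I - P) x^(i)
    return theta
</source>

Each column of the returned matrix is one of the vectors <math>\theta_1, \ldots, \theta_n</math>; in practice one would monitor <math>\ell(\theta)</math> for convergence rather than running a fixed number of iterations.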
