Softmax Regression

From Ufldl

For convenience, we will also write <math>\theta</math> to denote all the
parameters of our model.  When you implement softmax regression, it is usually
convenient to represent <math>\theta</math> as a <math>k</math>-by-<math>(n+1)</math> matrix obtained by
stacking up <math>\theta_1, \theta_2, \ldots, \theta_k</math> in rows, so that
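As a minimal NumPy sketch of this representation (the parameter values and the helper name `softmax_hypothesis` are hypothetical, not from the tutorial), the stacked <math>k</math>-by-<math>(n+1)</math> matrix lets the hypothesis be computed with a single matrix-vector product:

```python
import numpy as np

def softmax_hypothesis(theta, x):
    """Evaluate h_theta(x) for softmax regression.

    theta: (k, n+1) matrix whose row j is theta_j (intercept term first).
    x:     (n+1,) input vector with x[0] = 1 for the intercept.
    Returns a length-k vector of class probabilities that sums to 1.
    """
    scores = theta @ x            # theta_j^T x for every class j at once
    scores -= np.max(scores)      # subtract the max for numerical stability
    exp_scores = np.exp(scores)
    return exp_scores / np.sum(exp_scores)

# Hypothetical example: k = 3 classes, n = 2 features
theta = np.array([[0.1, 0.2, -0.3],
                  [0.0, -0.1, 0.4],
                  [0.2, 0.3, 0.1]])
x = np.array([1.0, 0.5, -1.0])    # x[0] = 1 is the intercept entry
probs = softmax_hypothesis(theta, x)
```

Subtracting the maximum score before exponentiating does not change the result (it cancels in the ratio) but avoids overflow for large scores.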
regression's parameters are "redundant."  More formally, we say that our
softmax model is '''overparameterized,''' meaning that for any hypothesis we might
fit to the data, there are multiple parameter settings that give rise to exactly
the same hypothesis function <math>h_\theta</math> mapping from inputs <math>x</math>
to the predictions.  
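This redundancy can be checked numerically: subtracting any fixed vector <math>\psi</math> from every <math>\theta_j</math> leaves the hypothesis unchanged, because the <math>\psi^T x</math> factor cancels between numerator and denominator of the softmax. A small sketch (the random parameters here are illustrative, not from the tutorial):

```python
import numpy as np

def softmax_hypothesis(theta, x):
    # theta: (k, n+1) matrix of stacked parameter rows; x: (n+1,) with x[0] = 1
    scores = theta @ x
    scores -= scores.max()        # stability shift; also cancels in the ratio
    e = np.exp(scores)
    return e / e.sum()

rng = np.random.default_rng(0)
theta = rng.normal(size=(3, 4))               # k = 3 classes, n = 3 features
x = np.concatenate(([1.0], rng.normal(size=3)))
psi = rng.normal(size=4)                      # an arbitrary shift vector

h1 = softmax_hypothesis(theta, x)
h2 = softmax_hypothesis(theta - psi, x)       # psi subtracted from every row
# h1 and h2 agree: exp(-psi^T x) cancels in numerator and denominator
```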
classifier would be appropriate.  In the second case, it would be more appropriate to build
three separate logistic regression classifiers.
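To make the contrast concrete, here is a sketch of the second option: <math>k</math> independent logistic (sigmoid) classifiers sharing the same stacked parameter layout. The parameter values are hypothetical; the point is that, unlike softmax, the per-label probabilities are not constrained to sum to 1, so an example may receive several labels or none.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical parameters: each row is an independent binary logistic
# regression classifier for one (non-mutually-exclusive) label.
theta = np.array([[0.5, 1.0, -0.5],
                  [-0.2, 0.3, 0.8],
                  [0.1, -0.7, 0.4]])
x = np.array([1.0, 0.2, -0.4])    # x[0] = 1 is the intercept entry

p = sigmoid(theta @ x)            # one independent probability per label
# Each p[j] is in (0, 1), but the entries need not sum to 1.
```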
{{Softmax}}

{{Languages|Softmax回归|中文}}

Latest revision as of 13:24, 7 April 2013
