Softmax Regression
From Ufldl
For convenience, we will also write <math>\theta</math> to denote all the parameters of our model. When you implement softmax regression, it is usually convenient to represent <math>\theta</math> as a <math>k</math>-by-<math>(n+1)</math> matrix obtained by stacking up <math>\theta_1, \theta_2, \ldots, \theta_k</math> in rows, so that

<math>
\theta = \begin{bmatrix}
\mbox{---} \ \theta_1^T \ \mbox{---} \\
\mbox{---} \ \theta_2^T \ \mbox{---} \\
\vdots \\
\mbox{---} \ \theta_k^T \ \mbox{---}
\end{bmatrix}.
</math>
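This representation can be sketched as follows. A minimal example assuming numpy, with illustrative dimensions <math>k = 3</math> classes and <math>n = 4</math> features (the extra column of <math>\theta</math> multiplies the intercept term <math>x_0 = 1</math>):

```python
import numpy as np

k, n = 3, 4                                # illustrative dimensions
rng = np.random.default_rng(0)

# theta stacks theta_1, ..., theta_k as rows: a k-by-(n+1) matrix.
theta = rng.normal(size=(k, n + 1))

def h(theta, x):
    """Softmax hypothesis: the k class probabilities for input x."""
    x = np.concatenate(([1.0], x))         # prepend the intercept term x_0 = 1
    scores = theta @ x                     # theta_j^T x for each class j
    scores -= scores.max()                 # shift for numerical stability
    e = np.exp(scores)
    return e / e.sum()

x = rng.normal(size=n)
p = h(theta, x)
print(p)            # k nonnegative probabilities that sum to 1
```

Subtracting the maximum score before exponentiating does not change the result (softmax is invariant to a constant shift) but avoids overflow for large scores.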
regression's parameters are "redundant." More formally, we say that our softmax model is '''overparameterized''', meaning that for any hypothesis we might fit to the data, there are multiple parameter settings that give rise to exactly the same hypothesis function <math>h_\theta</math> mapping from inputs <math>x</math> to the predictions.
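The redundancy is easy to check numerically: subtracting the same fixed vector <math>\psi</math> from every <math>\theta_j</math> leaves all predicted probabilities unchanged, since <math>\psi^T x</math> cancels between the numerator and denominator of the softmax. A minimal sketch assuming numpy (dimensions and the hypothesis function `h` are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
k, n = 3, 4
theta = rng.normal(size=(k, n + 1))

def h(theta, x):
    """Softmax hypothesis: the k class probabilities for input x."""
    x = np.concatenate(([1.0], x))
    scores = theta @ x
    scores -= scores.max()
    e = np.exp(scores)
    return e / e.sum()

# Subtract the same vector psi from every row theta_j:
# exp((theta_j - psi)^T x) / sum_l exp((theta_l - psi)^T x)
#   = exp(theta_j^T x) / sum_l exp(theta_l^T x),
# so the hypothesis is unchanged.
psi = rng.normal(size=n + 1)
theta_shifted = theta - psi

x = rng.normal(size=n)
same = np.allclose(h(theta, x), h(theta_shifted, x))
print(same)   # True: two distinct parameter settings, one hypothesis
```

This is why the softmax cost function has no unique minimizer, and why one common remedy is to fix <math>\theta_k = 0</math> or to add a weight-decay term.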
classifier would be appropriate. In the second case, it would be more appropriate to build three separate logistic regression classifiers.
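The practical difference between the two designs can be sketched numerically: a single softmax classifier produces coupled probabilities that sum to 1 (mutually exclusive classes), while <math>k</math> independent logistic classifiers each produce their own probability, and the total need not be 1 (classes may overlap). A minimal illustration assuming numpy, with both models sharing the same illustrative score matrix:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
k, n = 3, 4                                # illustrative dimensions
theta = rng.normal(size=(k, n + 1))
x = np.concatenate(([1.0], rng.normal(size=n)))
scores = theta @ x                         # one score per class

# Softmax: one classifier, probabilities coupled, summing to exactly 1 --
# appropriate when each example belongs to exactly one class.
softmax_p = np.exp(scores - scores.max())
softmax_p /= softmax_p.sum()

# k binary logistic classifiers: each probability is computed
# independently, so the total is generally not 1 -- appropriate when
# an example can belong to several classes at once.
binary_p = sigmoid(scores)

print(softmax_p.sum())   # 1.0
print(binary_p.sum())    # generally differs from 1.0
```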
+ | |||
+ | |||
+ | {{Softmax}} | ||
+ | |||
+ | |||
+ | {{Languages|Softmax回归|中文}} |