Softmax Regression
From Ufldl
(→Relationship to Logistic Regression)
Line 73: | Line 73: | ||
For convenience, we will also write | For convenience, we will also write | ||
<math>\theta</math> to denote all the | <math>\theta</math> to denote all the | ||
- | parameters of our model. When you implement softmax regression, | + | parameters of our model. When you implement softmax regression, it is usually |
convenient to represent <math>\theta</math> as a <math>k</math>-by-<math>(n+1)</math> matrix obtained by | convenient to represent <math>\theta</math> as a <math>k</math>-by-<math>(n+1)</math> matrix obtained by | ||
stacking up <math>\theta_1, \theta_2, \ldots, \theta_k</math> in rows, so that | stacking up <math>\theta_1, \theta_2, \ldots, \theta_k</math> in rows, so that | ||
Line 202: | Line 202: | ||
regression's parameters are "redundant." More formally, we say that our | regression's parameters are "redundant." More formally, we say that our | ||
softmax model is '''overparameterized,''' meaning that for any hypothesis we might | softmax model is '''overparameterized,''' meaning that for any hypothesis we might | ||
- | fit to the data, there | + | fit to the data, there are multiple parameter settings that give rise to exactly |
the same hypothesis function <math>h_\theta</math> mapping from inputs <math>x</math> | the same hypothesis function <math>h_\theta</math> mapping from inputs <math>x</math> | ||
to the predictions. | to the predictions. | ||
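The redundancy can be checked numerically. The following is a minimal sketch, assuming the usual softmax hypothesis in which class <math>j</math> receives probability proportional to <math>\exp(\theta_j^T x)</math>. The helper names <code>softmax_hypothesis</code> and <code>shift_rows</code>, the vector <code>psi</code>, and all numeric values are hypothetical and chosen only for illustration. Subtracting the same vector from every row of <math>\theta</math> should leave the predictions unchanged:

```python
import math

def softmax_hypothesis(theta, x):
    """Return class probabilities for input x; theta is a list of k rows."""
    scores = [sum(t * xi for t, xi in zip(row, x)) for row in theta]
    m = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def shift_rows(theta, psi):
    """Subtract the same vector psi from every row of theta."""
    return [[t - p for t, p in zip(row, psi)] for row in theta]

# Hypothetical parameters: k = 3 classes, n + 1 = 3 inputs
# (x[0] = 1 is assumed to be the intercept term).
theta = [[0.5, -1.0, 2.0],
         [1.5, 0.3, -0.7],
         [-0.2, 0.8, 0.1]]
x = [1.0, 2.0, -1.0]
psi = [10.0, -3.0, 4.5]  # arbitrary shift applied to every row

p_original = softmax_hypothesis(theta, x)
p_shifted = softmax_hypothesis(shift_rows(theta, psi), x)

# The two parameter settings differ, yet the predictions agree
# up to floating-point rounding.
same = all(math.isclose(a, b, rel_tol=1e-9)
           for a, b in zip(p_original, p_shifted))
print(same)
```

The shift adds the same constant <math>-\psi^T x</math> to every class score, so it cancels between the numerator and denominator of the softmax.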
Line 380: | Line 380: | ||
or three logistic regression classifiers? (ii) Now suppose your classes are | or three logistic regression classifiers? (ii) Now suppose your classes are | ||
indoor_scene, black_and_white_image, and image_has_people. Would you use softmax | indoor_scene, black_and_white_image, and image_has_people. Would you use softmax | ||
- | regression | + | regression or multiple logistic regression classifiers? |
In the first case, the classes are mutually exclusive, so a softmax regression | In the first case, the classes are mutually exclusive, so a softmax regression | ||
classifier would be appropriate. In the second case, it would be more appropriate to build | classifier would be appropriate. In the second case, it would be more appropriate to build | ||
three separate logistic regression classifiers. | three separate logistic regression classifiers. | ||
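The difference between the two choices can be sketched as follows, assuming hypothetical raw scores for a single image (the function names and numbers are illustrative and do not come from the exercise). Softmax outputs compete with each other and sum to 1, which suits mutually exclusive classes. Independent logistic outputs each lie strictly between 0 and 1 and can all be large at once, which suits labels that may co-occur:

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max score for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(s):
    """Logistic function: probability for a single yes/no label."""
    return 1.0 / (1.0 + math.exp(-s))

# Hypothetical scores for three labels on one image.
scores = [2.0, 1.5, 3.0]

softmax_probs = softmax(scores)                # one distribution over 3 classes
logistic_probs = [sigmoid(s) for s in scores]  # three independent probabilities

print(sum(softmax_probs))   # close to 1: exactly one class is assumed to apply
print(sum(logistic_probs))  # may exceed 1: several labels can apply at once
```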
+ | |||
+ | |||
+ | {{Softmax}} | ||
+ | |||
+ | |||
+ | {{Languages|Softmax回归|中文}} |