Softmax Regression

=== Weight Regularization ===
When using softmax regression in practice, it is important to use weight regularization. In particular, if there exists a linear separator that perfectly classifies all the data points, then the softmax objective is unbounded (given any <math>\theta</math> that separates the data perfectly, one can always scale <math>\theta</math> to be larger and obtain a better objective value). With weight regularization, one penalizes large weights and thus avoids these degenerate situations.

Weight regularization is also important because it often results in models that generalize better. In particular, one can view weight regularization as placing a (Gaussian) prior on <math>\theta</math>, so as to prefer <math>\theta</math> with smaller values.
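
For concreteness, one standard way to regularize is to add an <math>\ell_2</math> weight decay term to the softmax cost. The form below is an illustrative sketch of that choice (with <math>m</math> training examples, <math>k</math> classes, input components indexed <math>0, \ldots, n</math>, and regularization strength <math>\lambda > 0</math>):

:<math>
J(\theta) = -\frac{1}{m} \left[ \sum_{i=1}^{m} \sum_{j=1}^{k} 1\left\{ y^{(i)} = j \right\} \log \frac{e^{\theta_j^T x^{(i)}}}{\sum_{l=1}^{k} e^{\theta_l^T x^{(i)}}} \right] + \frac{\lambda}{2} \sum_{i=1}^{k} \sum_{j=0}^{n} \theta_{ij}^2
</math>
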
Minimizing <math>J(\theta)</math> now performs regularized softmax regression.
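
As a sanity check on the objective above, here is a minimal NumPy sketch of the regularized cost and its gradient. The function and variable names are invented for illustration, and it assumes the weight decay penalty is applied to every entry of <math>\theta</math>:

<pre>
import numpy as np

def softmax_cost_grad(Theta, X, y, lam):
    """Regularized softmax cost and gradient.

    Theta : (k, n) weights, one row per class
    X     : (m, n) design matrix, one row per example
    y     : (m,)   integer labels in {0, ..., k-1}
    lam   : weight decay strength (lambda in the text)
    """
    m = X.shape[0]
    scores = X @ Theta.T                         # (m, k): theta_j^T x^(i)
    scores -= scores.max(axis=1, keepdims=True)  # stabilize exp; probabilities unchanged
    probs = np.exp(scores)
    probs /= probs.sum(axis=1, keepdims=True)    # (m, k) class probabilities

    # Mean negative log-likelihood plus (lambda/2) * sum of squared weights.
    cost = -np.mean(np.log(probs[np.arange(m), y])) + 0.5 * lam * np.sum(Theta ** 2)

    # Gradient: (1/m) * (probs - onehot(y))^T X + lambda * Theta.
    onehot = np.zeros_like(probs)
    onehot[np.arange(m), y] = 1.0
    grad = (probs - onehot).T @ X / m + lam * Theta
    return cost, grad

# Smoke test: with Theta = 0 the cost should equal log(k) (here log 3).
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 3))
y = np.array([0, 1, 2, 1, 0])
cost, grad = softmax_cost_grad(np.zeros((3, 3)), X, y, lam=1e-2)
</pre>

Setting lam to zero recovers the unregularized cost, which has no finite minimizer when the classes are linearly separable; any lam > 0 makes the objective strictly convex, so a unique minimizer exists.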
 
== Parameterization ==