Exercise: Softmax Regression

From Ufldl

In the file <tt>softmax_exercise.zip</tt>, we have provided some starter code. You should write your code in the places indicated by "YOUR CODE HERE" in the files. The only file you need to modify for this exercise is <tt>softmaxCost.m</tt>.
=== Step 0: Initialise constants and parameters ===

Two constants, <tt>inputSize</tt> and <tt>outputSize</tt>, corresponding to the size of each input vector and the number of class labels, have been defined in the starter code. This will allow you to reuse your code on a different data set in a later exercise. We also initialise <tt>lambda</tt>, the weight decay parameter, here.
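As a rough illustration, the initialisation might look like the following. The specific values here are assumptions for the MNIST-style digit data used in this exercise (28×28 pixel images, 10 classes) and a typical weight decay setting; the actual values are set for you in the starter code.

```matlab
% Illustrative values only -- the real constants are defined in the
% starter code. 28*28 and 10 match 28x28-pixel digit images with ten
% class labels; 1e-4 is a common choice of weight decay parameter.
inputSize  = 28 * 28;   % size of each input vector
outputSize = 10;        % number of class labels
lambda     = 1e-4;      % weight decay parameter
```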
=== Step 1: Load data ===
=== Step 2: Implement softmaxCost ===
In <tt>softmaxCost.m</tt>, implement code to compute the softmax cost function. Since <tt>minFunc</tt> minimises this cost, we consider the '''negative''' of the log-likelihood <math>-\ell(\theta)</math>, in order to maximise <math>\ell(\theta)</math>. Remember also to include the weight decay term in the cost. Your code should also compute the appropriate gradients, as well as the predictions for the input data (which will be used in the cross-validation step later).
'''Implementation tip: computing the output matrix''' - in your code, you may need to compute the output matrix <tt>M</tt>, such that <tt>M(r, c)</tt> is 1 if <math>y^{(c)} = r</math> and 0 otherwise. This can be done quickly, without a loop, using the MATLAB functions <tt>sparse</tt> and <tt>full</tt>. <tt>sparse(r, c, v)</tt> creates a sparse matrix such that <tt>M(r(i), c(i)) = v(i)</tt> for all i. That is, the vectors <tt>r</tt> and <tt>c</tt> give the position of the elements whose values we wish to set, and <tt>v</tt> the corresponding values of the elements. Running <tt>full</tt> on a sparse matrix gives the full representation of the matrix for use. Note that the code for using <tt>sparse</tt> and <tt>full</tt> to compute the output matrix has already been included in <tt>softmaxCost.m</tt>.
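To make the pieces concrete, here is a minimal sketch of the whole computation, assuming <tt>theta</tt> is an <tt>outputSize</tt> × <tt>inputSize</tt> matrix, <tt>data</tt> is <tt>inputSize</tt> × <tt>numCases</tt>, and <tt>labels</tt> is a vector of class labels in <tt>1..outputSize</tt>. Variable names beyond those mentioned in the starter code are illustrative, not the starter code's own; subtracting the column-wise maximum before exponentiating is a standard trick to avoid overflow and does not change the softmax probabilities.

```matlab
% Output matrix via sparse/full: M(r, c) = 1 iff labels(c) == r
M = full(sparse(labels, 1:numCases, 1));

z = theta * data;                        % outputSize x numCases scores
z = bsxfun(@minus, z, max(z, [], 1));    % subtract column max to avoid overflow in exp
h = exp(z);
h = bsxfun(@rdivide, h, sum(h, 1));      % column-wise softmax probabilities

% Negative log-likelihood plus the weight decay term
cost = -sum(sum(M .* log(h))) / numCases + (lambda / 2) * sum(theta(:) .^ 2);

% Gradient with respect to theta, including weight decay
thetagrad = -(M - h) * data' / numCases + lambda * theta;

% Predictions: most probable class for each input column
[~, pred] = max(h, [], 1);
```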

Revision as of 05:05, 13 April 2011
