Logistic Regression Vectorization Example

@@ Line 7: / Line 7: @@
 and <math>\textstyle \theta \in \Re^{n+1}</math>, and <math>\textstyle \theta_0</math> is our intercept term.  We have a training set
 <math>\textstyle \{(x^{(1)}, y^{(1)}), \ldots, (x^{(m)}, y^{(m)})\}</math> of <math>\textstyle m</math> examples, and the batch gradient
-ascent update rule is "<math>\textstyle \theta := \theta + \alpha \nabla_\theta \ell(\theta)</math>, where <math>\textstyle \ell(\theta)</math>
+ascent update rule is <math>\textstyle \theta := \theta + \alpha \nabla_\theta \ell(\theta)</math>, where <math>\textstyle \ell(\theta)</math>
 is the log likelihood and <math>\textstyle \nabla_\theta \ell(\theta)</math> is its gradient with respect to <math>\textstyle \theta</math>.
@@ Line 56: / Line 56: @@
 % Fast implementation of matrix-vector multiply
-grad = A*b
+grad = A*b;
 </syntaxhighlight>
@@ Line 63: / Line 63: @@
 <syntaxhighlight lang="matlab">
 % Implementation 3
-grad = x * (y- sigmoid(theta'*x))'
+grad = x * (y- sigmoid(theta'*x))';
 </syntaxhighlight>
 Here, we assume that the Matlab/Octave function <tt>sigmoid(z)</tt> takes as input a vector <tt>z</tt>, applies the sigmoid function component-wise to the input, and returns the result.  The output of <tt>sigmoid(z)</tt> is therefore also a vector, of the same dimension as the input <tt>z</tt>.
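The text assumes a component-wise <tt>sigmoid</tt> but does not show one. The following is a minimal sketch of one possible definition, together with a check that Implementation 3 agrees with an explicit loop over the training examples. The anonymous-function definition, the toy data, and the variable <tt>grad_loop</tt> are illustrative assumptions rather than part of the original page; the layout (each column of <tt>x</tt> is one example, <tt>y</tt> is a row vector of labels) is inferred from the dimensions that <tt>theta'*x</tt> requires.

<syntaxhighlight lang="matlab">
% Assumed definition: component-wise sigmoid as an anonymous function.
% The ./ operator makes the division element-wise, so z may be a vector.
sigmoid = @(z) 1 ./ (1 + exp(-z));

% Toy data, made up for illustration: n+1 = 3 rows, m = 4 examples.
% Each column is one example; the first row of ones is assumed to be
% the intercept feature that pairs with theta_0.
x = [1.0  1.0  1.0  1.0;
     0.5 -1.2  2.0  0.3;
     1.5  0.7 -0.4 -2.1];
y = [1 0 1 0];            % 1-by-m row vector of labels
theta = zeros(3, 1);      % (n+1)-by-1 parameter vector

% Implementation 3: vectorized gradient of the log likelihood.
grad = x * (y - sigmoid(theta' * x))';

% Explicit loop over examples, for comparison only.
grad_loop = zeros(3, 1);
for i = 1:size(x, 2)
  grad_loop = grad_loop + x(:, i) * (y(i) - sigmoid(theta' * x(:, i)));
end

% Largest absolute difference between the two results.
disp(max(abs(grad - grad_loop)));
</syntaxhighlight>

The two results should agree up to floating-point rounding, since the vectorized form computes the same sum over examples as a single matrix-vector product.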

Revision as of 00:53, 23 April 2011