# Neural Networks

Revision by Ang (previous revision: 05:29, 26 February 2011).

## Revision as of 05:36, 26 February 2011

Consider a supervised learning problem where we have access to labeled training examples $(x^{(i)}, y^{(i)})$. Neural networks give a way of defining a complex, non-linear form of hypotheses $h_{W,b}(x)$, with parameters $W, b$ that we can fit to our data.

To describe neural networks, we will begin by describing the simplest possible neural network, one which comprises a single "neuron." We will use the following diagram to denote a single neuron:

[Figure: SingleNeuron.png]

This "neuron" is a computational unit that takes as input $x_1, x_2, x_3$ (and a +1 intercept term), and outputs $h_{W,b}(x) = f(W^Tx + b) = f(\sum_{i=1}^3 W_{i}x_i +b)$, where $f : \Re \mapsto \Re$ is called the activation function. In these notes, we will choose $f(\cdot)$ to be the sigmoid function:

$f(z) = \frac{1}{1+\exp(-z)}.$

Thus, our single neuron corresponds exactly to the input-output mapping defined by logistic regression.
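As a rough illustration of the mapping just described, the following Python sketch computes the output of a single sigmoid neuron with three inputs. The function names `sigmoid` and `neuron_output`, and the particular weights, bias, and input values, are assumptions made up for this example; they are not part of the notes.

```python
import math

def sigmoid(z):
    """Sigmoid activation f(z) = 1 / (1 + exp(-z)).

    Note: math.exp overflows for very negative z (below about -709);
    this simple sketch does not guard against that.
    """
    return 1.0 / (1.0 + math.exp(-z))

def neuron_output(W, b, x):
    """Compute h_{W,b}(x) = f(sum_i W_i * x_i + b) for a single neuron."""
    z = sum(w_i * x_i for w_i, x_i in zip(W, x)) + b
    return sigmoid(z)

# Hypothetical parameters and input, chosen only for illustration.
W = [0.5, -0.25, 0.1]   # one weight per input x_1, x_2, x_3
b = 0.2                 # weight on the +1 intercept term
x = [1.0, 2.0, 3.0]

h = neuron_output(W, b, x)  # weighted sum is 0.5, so h = sigmoid(0.5)
print(h)
```

The output always lies strictly between 0 and 1, which is why it can be read as a probability in the logistic regression view.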

Although these notes will use the sigmoid function, it is worth noting that another common choice for $f$ is the hyperbolic tangent, or tanh, function:

$f(z) = \tanh(z) = \frac{e^z - e^{-z}}{e^z + e^{-z}}.$
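To compare the two activation functions numerically, here is a small Python sketch that evaluates both at a few points. The helper names `sigmoid` and `tanh_from_definition` and the sample points are assumptions for illustration only. The sketch also checks the standard identity $\tanh(z) = 2\,\sigma(2z) - 1$, where $\sigma$ denotes the sigmoid, which shows that tanh is a rescaled and shifted sigmoid.

```python
import math

def sigmoid(z):
    """Sigmoid f(z) = 1 / (1 + exp(-z)); outputs lie in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def tanh_from_definition(z):
    """tanh(z) = (e^z - e^-z) / (e^z + e^-z); outputs lie in (-1, 1)."""
    ez, emz = math.exp(z), math.exp(-z)
    return (ez - emz) / (ez + emz)

# Sample points chosen only for illustration.
sample_points = [-2.0, -0.5, 0.0, 0.5, 2.0]

for z in sample_points:
    s = sigmoid(z)
    t = tanh_from_definition(z)
    rescaled = 2.0 * sigmoid(2.0 * z) - 1.0  # should match t up to rounding
    print(f"z={z:+.1f}  sigmoid={s:.4f}  tanh={t:.4f}  2*sigmoid(2z)-1={rescaled:.4f}")
```

In practice one would normally call the built-in `math.tanh` rather than the hand-written version, which is spelled out here only to mirror the formula.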

Here are plots of the sigmoid and tanh functions: