Deep Networks: Overview
From Ufldl
(→Greedy layer-wise training) |
Line 71: | Line 71: | ||
The main learning algorithm that researchers were using was to randomly initialize | The main learning algorithm that researchers were using was to randomly initialize | ||
the weights of a deep network, and then train it using a labeled | the weights of a deep network, and then train it using a labeled | ||
- | training set <math>\{ (x^{(1)}_l, y^{(1}), \ldots, (x^{(m_l)}_l, y^{(m_l)}) \}</math> | + | training set <math>\{ (x^{(1)}_l, y^{(1)}), \ldots, (x^{(m_l)}_l, y^{(m_l)}) \}</math> |
using a supervised learning objective, for example by applying gradient descent to try to | using a supervised learning objective, for example by applying gradient descent to try to | ||
drive down the training error. However, this usually did not work well. | drive down the training error. However, this usually did not work well. | ||
Line 175: | Line 175: | ||
+ | {{CNN}} | ||
<!-- | <!-- | ||
Line 190: | Line 191: | ||
[http://jmlr.csail.mit.edu/proceedings/papers/v9/erhan10a/erhan10a.pdf] | [http://jmlr.csail.mit.edu/proceedings/papers/v9/erhan10a/erhan10a.pdf] | ||
!--> | !--> | ||
+ | |||
+ | |||
+ | {{Languages|深度网络概览|中文}} |