UFLDL Recommended Readings

Revision as of 17:45, 10 March 2011

Latest revision as of 07:00, 18 February 2012

Line 13:

* [http://www.cs.toronto.edu/~hinton/science.pdf] Hinton, G. E. and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 2006.

** If you want to play with the code, you can also find it at [http://www.cs.toronto.edu/~hinton/MatlabForSciencePaper.html].

-

* [http://www-etud.iro.umontreal.ca/~larocheh/publications/greedy-deep-nets-nips-06.pdf] Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. Greedy Layer-Wise Training of Deep Networks. NIPS 2006

+

* [http://books.nips.cc/papers/files/nips19/NIPS2006_0739.pdf] Bengio, Y., Lamblin, P., Popovici, P., Larochelle, H. Greedy Layer-Wise Training of Deep Networks. NIPS 2006

* [http://www.cs.toronto.edu/~larocheh/publications/icml-2008-denoising-autoencoders.pdf] Pascal Vincent, Hugo Larochelle, Yoshua Bengio and Pierre-Antoine Manzagol. Extracting and Composing Robust Features with Denoising Autoencoders. ICML 2008.

-

** (They have a nice model, but then backwards rationalize it into a probabilistic model. Ignore the backwards rationalized probabilistic model.) (Someone please clarify eactly which section of the paper this is.)

+

** (They have a nice model, but then backwards rationalize it into a probabilistic model. Ignore the backwards rationalized probabilistic model [Section 4].)

Analyzing deep learning/why does deep learning work:

* [http://www.cs.toronto.edu/~larocheh/publications/deep-nets-icml-07.pdf] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio. An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation. ICML 2007.

-

** (Someone read this and let us know if this is worth keeping,.)

+

** (Someone read this and let us know if this is worth keeping,. [Most model related material already covered by other papers, it seems not many impactful conclusions can be made from results, but can serve as reading for reinforcement for deep models])

* [http://www.jmlr.org/papers/volume11/erhan10a/erhan10a.pdf] Dumitru Erhan, Yoshua Bengio, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent, and Samy Bengio. Why Does Unsupervised Pre-training Help Deep Learning? JMLR 2010

* [http://cs.stanford.edu/~ang/papers/nips09-MeasuringInvariancesDeepNetworks.pdf] Ian J. Goodfellow, Quoc V. Le, Andrew M. Saxe, Honglak Lee and Andrew Y. Ng. Measuring invariances in deep networks. NIPS 2009.

Line 28:

* [http://deeplearning.net/tutorial/rbm.html] Tutorial on RBMs.

** But ignore the Theano code examples.

-

** (Someone tell us if this should be moved later. Useful for understanding some of DL literature, but not needed for many of the later papers?)

+

** (Someone tell us if this should be moved later. Useful for understanding some of DL literature, but not needed for many of the later papers? [Seems ok to leave in, useful introduction if reader had no idea about RBM's, and have to deal with Hinton's 06 Science paper or 3-way RBM's right away])

Line 47:

* [http://www.iro.umontreal.ca/~lisa/publications2/index.php/attachments/single/57] Yoshua Bengio, Réjean Ducharme, Pascal Vincent and Christian Jauvin, A Neural Probabilistic Language Model. JMLR 2003.

* [http://ronan.collobert.com/pub/matos/2008_nlp_icml.pdf] R. Collobert and J. Weston. A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. ICML 2008.

+

* [http://www.socher.org/uploads/Main/SocherPenningtonHuangNgManning_EMNLP2011.pdf] Richard Socher, Jeffrey Pennington, Eric Huang, Andrew Y. Ng, and Christopher D. Manning. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions. EMNLP 2011

+

* [http://www.socher.org/uploads/Main/SocherHuangPenningtonNgManning_NIPS2011.pdf] Richard Socher, Eric Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection. NIPS 2011

* [http://www.cs.toronto.edu/~hinton/absps/threenew.pdf] Mnih, A. and Hinton, G. E. Three New Graphical Models for Statistical Language Modelling. ICML 2007

Line 61:

Line 63:

* [http://www.cs.toronto.edu/~ranzato/publications/ranzato_aistats2010.pdf] M. Ranzato, A. Krizhevsky, G. Hinton. Factored 3-Way Restricted Boltzmann Machines for Modeling Natural Images. In AISTATS 2010.

* [http://www.cs.toronto.edu/~ranzato/publications/ranzato_cvpr2010.pdf] M. Ranzato, G. Hinton, Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines. CVPR 2010

-

** (someone and tell us if you need to read the 3-way RBM paper before the mcRBM one)

+

** (someone and tell us if you need to read the 3-way RBM paper before the mcRBM one [I didn't find it necessary, in fact the CVPR paper seemed easier to understand.])

* [http://www.cs.toronto.edu/~hinton/absps/mcphone.pdf] Dahl, G., Ranzato, M., Mohamed, A. and Hinton, G. E. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine. NIPS 2010.

* [http://www.nature.com/nature/journal/v457/n7225/pdf/nature07481.pdf] Y. Karklin and M. S. Lewicki, Emergence of complex cell properties by learning to generalize in natural scenes, Nature, 2008.

-

** (someone tell us if this should be here. Interesting algorithm + nice visualizations, though maybe slightly hard to understand.)

+

** (someone tell us if this should be here. Interesting algorithm + nice visualizations, though maybe slightly hard to understand. [seems a good reminder there are other existing models])
