UFLDL Recommended Readings

 
Line 2: Line 2:
The basics:  
-
* [[http://cs294a.stanford.edu CS294A]] neural network/sparse autoencoder tutorial. (Most of this is now in the [[UFLDL Tutorial]], but the exercise is still on the CS294A website.)  
+
* [[http://cs294a.stanford.edu CS294A]] Neural Networks/Sparse Autoencoder Tutorial. (Most of this is now in the [[UFLDL Tutorial]], but the exercise is still on the CS294A website.)  
* [http://www.naturalimagestatistics.net/] Natural Image Statistics book, Hyvarinen et al.   
** This is long, so just skim or skip the chapters that you already know.   
Line 11: Line 11:
Autoencoders:  
-
* [http://www.cs.toronto.edu/~hinton/science.pdf]  Hinton, G. E. and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 2006.  If you want to play with the code, you can also find it at [http://www.cs.toronto.edu/~hinton/MatlabForSciencePaper.html].  
+
* [http://www.cs.toronto.edu/~hinton/science.pdf]  Hinton, G. E. and Salakhutdinov, R. R. Reducing the dimensionality of data with neural networks. Science 2006.   
-
* [http://www-etud.iro.umontreal.ca/~larocheh/publications/greedy-deep-nets-nips-06.pdf] Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H. Greedy Layer-Wise Training of Deep Networks. NIPS 2006  
+
** If you want to play with the code, you can also find it at [http://www.cs.toronto.edu/~hinton/MatlabForSciencePaper.html].  
 +
* [http://books.nips.cc/papers/files/nips19/NIPS2006_0739.pdf] Bengio, Y., Lamblin, P., Popovici, D., Larochelle, H. Greedy Layer-Wise Training of Deep Networks. NIPS 2006  
* [http://www.cs.toronto.edu/~larocheh/publications/icml-2008-denoising-autoencoders.pdf] Pascal Vincent, Hugo Larochelle, Yoshua Bengio and Pierre-Antoine Manzagol. Extracting and Composing Robust Features with Denoising Autoencoders. ICML 2008.   
-
** (They have a nice model, but then backwards rationalize it into a probabilistic model.  Ignore the backwards rationalized probabilistic model.) (Someone please clarify exactly which section of the paper this is.)
+
** (They have a nice model, but then backwards rationalize it into a probabilistic model.  Ignore the backwards rationalized probabilistic model [Section 4].)  
Analyzing deep learning/why does deep learning work:  
-
* Larochelle, Erhan, Courville, Bergstra, Bengio, ICML 2007. (Someone read this and let us know if this is worth keeping.)  
+
* [http://www.cs.toronto.edu/~larocheh/publications/deep-nets-icml-07.pdf] H. Larochelle, D. Erhan, A. Courville, J. Bergstra, and Y. Bengio. An Empirical Evaluation of Deep Architectures on Problems with Many Factors of Variation. ICML 2007.
 +
** (Someone read this and let us know if this is worth keeping. [Most of the model-related material is already covered by other papers, and it seems that not many impactful conclusions can be drawn from the results, but it can serve as reinforcement reading on deep models.])  
* [http://www.jmlr.org/papers/volume11/erhan10a/erhan10a.pdf] Dumitru Erhan, Yoshua Bengio, Aaron Courville, Pierre-Antoine Manzagol, Pascal Vincent, and Samy Bengio. Why Does Unsupervised Pre-training Help Deep Learning? JMLR 2010   
* [http://cs.stanford.edu/~ang/papers/nips09-MeasuringInvariancesDeepNetworks.pdf] Ian J. Goodfellow, Quoc V. Le, Andrew M. Saxe, Honglak Lee and Andrew Y. Ng. Measuring invariances in deep networks. NIPS 2009.  
Line 26: Line 28:
* [http://deeplearning.net/tutorial/rbm.html] Tutorial on RBMs.  
** But ignore the Theano code examples.
-
** (Someone tell us if this should be moved later.  Useful for understanding some of DL literature, but not needed for many of the later papers?)
+
** (Someone tell us if this should be moved later.  Useful for understanding some of the DL literature, but not needed for many of the later papers? [Seems OK to leave in; it is a useful introduction if the reader has no idea about RBMs and has to deal with Hinton's 2006 Science paper or 3-way RBMs right away.])
 +
 
 +
 
 +
Convolutional Networks:
 +
* [http://deeplearning.net/tutorial/lenet.html] Tutorial on Convolutional Neural Networks.
 +
** But ignore the Theano code examples.
Line 40: Line 47:
* [http://www.iro.umontreal.ca/~lisa/publications2/index.php/attachments/single/57] Yoshua Bengio, Réjean Ducharme, Pascal Vincent and Christian Jauvin, A Neural Probabilistic Language Model. JMLR 2003.
* [http://ronan.collobert.com/pub/matos/2008_nlp_icml.pdf] R. Collobert and J. Weston. A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. ICML 2008.
 +
* [http://www.socher.org/uploads/Main/SocherPenningtonHuangNgManning_EMNLP2011.pdf] Richard Socher, Jeffrey Pennington, Eric Huang, Andrew Y. Ng, and Christopher D. Manning. Semi-Supervised Recursive Autoencoders for Predicting Sentiment Distributions. EMNLP 2011
 +
* [http://www.socher.org/uploads/Main/SocherHuangPenningtonNgManning_NIPS2011.pdf] Richard Socher, Eric Huang, Jeffrey Pennington, Andrew Y. Ng, and Christopher D. Manning. Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection. NIPS 2011
* [http://www.cs.toronto.edu/~hinton/absps/threenew.pdf] Mnih, A. and Hinton, G. E. Three New Graphical Models for Statistical Language Modelling. ICML 2007
Line 54: Line 63:
* [http://www.cs.toronto.edu/~ranzato/publications/ranzato_aistats2010.pdf] M. Ranzato, A. Krizhevsky, G. Hinton. Factored 3-Way Restricted Boltzmann Machines for Modeling Natural Images. In AISTATS 2010.
* [http://www.cs.toronto.edu/~ranzato/publications/ranzato_cvpr2010.pdf] M. Ranzato, G. Hinton, Modeling Pixel Means and Covariances Using Factorized Third-Order Boltzmann Machines. CVPR 2010  
-
** (Someone tell us if you need to read the 3-way RBM paper before the mcRBM one.)
+
** (Someone tell us if you need to read the 3-way RBM paper before the mcRBM one. [I didn't find it necessary; in fact, the CVPR paper seemed easier to understand.])
* [http://www.cs.toronto.edu/~hinton/absps/mcphone.pdf] Dahl, G., Ranzato, M., Mohamed, A. and Hinton, G. E. Phone Recognition with the Mean-Covariance Restricted Boltzmann Machine. NIPS 2010.
* [http://www.nature.com/nature/journal/v457/n7225/pdf/nature07481.pdf] Y. Karklin and M. S. Lewicki, Emergence of complex cell properties by learning to generalize in natural scenes, Nature, 2008.
-
** (someone tell us if this should be here.  Interesting algorithm + nice visualizations, though maybe slightly hard to understand.)  
+
** (Someone tell us if this should be here.  Interesting algorithm + nice visualizations, though maybe slightly hard to understand. [Seems a good reminder that there are other existing models.])  
Overview
-
* [http://www.iro.umontreal.ca/~bengioy/papers/ftml_book.pdf] Yoshua Bengio. Learning Deep Architectures for AI. FTML 2009. (Broad landscape description of the field, but the technical details are hard to follow, so ignore them.  This is also easier to read after you've gone over some of the literature of the field.)
+
* [http://www.iro.umontreal.ca/~bengioy/papers/ftml_book.pdf] Yoshua Bengio. Learning Deep Architectures for AI. FTML 2009.  
 +
** (Broad landscape description of the field, but the technical details are hard to follow, so ignore them.  This is also easier to read after you've gone over some of the literature of the field.)

Latest revision as of 07:00, 18 February 2012
