      A Fast Learning Algorithm for Deep Belief Nets

      Geoffrey E. Hinton, Simon Osindero 1, Yee-Whye Teh 2
      Neural Computation
      MIT Press


          Abstract

          We show how to use “complementary priors” to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.
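The greedy, layer-at-a-time idea in the abstract can be illustrated with a minimal sketch. It assumes each layer is a binary restricted Boltzmann machine trained with one step of contrastive divergence (CD-1), which the abstract itself does not specify. The function names, layer sizes, learning rate, and epoch count are hypothetical, and the label units and the contrastive wake-sleep fine-tuning stage are omitted.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train_rbm(data, n_hidden, epochs=5, lr=0.1, rng=None):
    """Train one binary RBM with CD-1 (an assumed training rule).

    Returns (W, b_vis, b_hid). Full-batch updates, for brevity.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    n_samples, n_visible = data.shape
    W = 0.01 * rng.standard_normal((n_visible, n_hidden))
    b_vis = np.zeros(n_visible)
    b_hid = np.zeros(n_hidden)
    for _ in range(epochs):
        # Positive phase: hidden probabilities driven by the data.
        h0 = sigmoid(data @ W + b_hid)
        h0_sample = (rng.random(h0.shape) < h0).astype(float)
        # Negative phase: one reconstruction step down and back up.
        v1 = sigmoid(h0_sample @ W.T + b_vis)
        h1 = sigmoid(v1 @ W + b_hid)
        # CD-1 update: data statistics minus reconstruction statistics.
        W += lr * (data.T @ h0 - v1.T @ h1) / n_samples
        b_vis += lr * (data - v1).mean(axis=0)
        b_hid += lr * (h0 - h1).mean(axis=0)
    return W, b_vis, b_hid

def greedy_pretrain(data, layer_sizes, **kwargs):
    """Stack RBMs: each new layer is trained on the layer below's output."""
    layers = []
    x = data
    for n_hidden in layer_sizes:
        W, b_vis, b_hid = train_rbm(x, n_hidden, **kwargs)
        layers.append((W, b_vis, b_hid))
        # Hidden probabilities become the "data" for the next layer
        # (a simplification: probabilities rather than binary samples).
        x = sigmoid(x @ W + b_hid)
    return layers
```

A call such as `greedy_pretrain(binary_images, [8, 6, 4])` would return one `(W, b_vis, b_hid)` triple per layer. In the scheme the abstract describes, weights obtained this way only initialize the slower fine-tuning procedure.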


          Most cited references (11)


          Gradient-based learning applied to document recognition


            Training products of experts by minimizing contrastive divergence.

            It is possible to combine multiple latent-variable models of the same data by multiplying their probability distributions together and then renormalizing. This way of combining individual "expert" models makes it hard to generate samples from the combined model but easy to infer the values of the latent variables of each expert, because the combination rule ensures that the latent variables of different experts are conditionally independent when given the data. A product of experts (PoE) is therefore an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary. Training a PoE by maximizing the likelihood of the data is difficult because it is hard even to approximate the derivatives of the renormalization term in the combination rule. Fortunately, a PoE can be trained using a different objective function called "contrastive divergence" whose derivatives with regard to the parameters can be approximated accurately and efficiently. Examples are presented of contrastive divergence learning using several types of expert on several types of data.
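The combination rule in the abstract above (multiply the experts' distributions, then renormalize) can be shown on a toy case. This sketch assumes a single discrete variable with a handful of states, so the renormalization term can be computed by direct summation; the function name and the example numbers are hypothetical. In the high-dimensional settings the abstract has in mind, that sum is intractable, which is the difficulty contrastive divergence is meant to sidestep.

```python
import numpy as np

def product_of_experts(expert_probs):
    """Combine experts over one discrete variable.

    expert_probs: array-like of shape (n_experts, n_states); each row is
    one expert's probability distribution over the same states.
    """
    expert_probs = np.asarray(expert_probs, dtype=float)
    # Multiply the experts' probabilities state by state.
    unnormalized = np.prod(expert_probs, axis=0)
    # Renormalize so the combined distribution sums to 1.
    total = unnormalized.sum()
    if total == 0.0:
        raise ValueError("experts assign zero joint probability to every state")
    return unnormalized / total

# Two hypothetical experts over three states.
expert_a = [0.5, 0.25, 0.25]
expert_b = [0.5, 0.5, 0.0]
combined = product_of_experts([expert_a, expert_b])
```

Because the experts are multiplied, any state that a single expert rules out (the third state for `expert_b`) receives zero probability in the product, so the combined distribution can be sharper than any individual expert.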

              Boosting a Weak Learning Algorithm by Majority


                Author and article information

                Journal: Neural Computation
                Publisher: MIT Press
                ISSN: 0899-7667, 1530-888X
                Published: July 2006
                Volume: 18
                Issue: 7
                Pages: 1527-1554
                Affiliations
                [1 ]Department of Computer Science, University of Toronto, Toronto, Canada M5S 3G4
                [2 ]Department of Computer Science, National University of Singapore, Singapore 117543
                Article
                DOI: 10.1162/neco.2006.18.7.1527
                PMID: 16764513
                Record ID: bc3147a5-e934-444e-87c1-e172f2143088
                © 2006