
      Minimal model of permutation symmetry in unsupervised learning

      Preprint

          Abstract

Permuting any two hidden units leaves the defining properties of a typical deep generative neural network invariant. This permutation symmetry plays an important role in understanding the computational performance of a broad class of neural networks with two or more hidden units, but a theoretical study of the symmetry has so far been lacking. Here, we propose a minimal model, a restricted Boltzmann machine with only two hidden units, to address how the permutation symmetry affects the critical learning data size at which concept formation (spontaneous symmetry breaking, in the language of physics) starts, and we semi-rigorously prove the conjecture that the critical data size is independent of the number of hidden units as long as that number is finite. Remarkably, we find that an embedded correlation between the two receptive fields of the hidden units reduces the critical data size. In particular, weakly correlated receptive fields significantly reduce the minimal data size that triggers the transition, provided the data are not too noisy. Inspired by the theory, we also propose an efficient, fully distributed algorithm to infer the receptive fields of the hidden units. Overall, our results demonstrate that permutation symmetry is an interesting property that affects the critical data size, and thereby the computational performance, of related learning algorithms. All of these effects can be probed analytically within the minimal model, providing theoretical insight into unsupervised learning in a more general context.
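The minimal model in the abstract, a restricted Boltzmann machine (RBM) with only two hidden units, makes the permutation symmetry easy to verify directly. The sketch below is a minimal illustration, not the authors' algorithm; the names W, b, c, and free_energy are assumptions of this sketch. It checks that exchanging the two hidden units, i.e. swapping the rows of the weight matrix together with the hidden biases, leaves the free energy, and hence the marginal probability p(v) proportional to exp(-F(v)), of any visible configuration unchanged.

import numpy as np

rng = np.random.default_rng(0)
n_visible, n_hidden = 5, 2                    # two hidden units: the minimal model
W = rng.normal(size=(n_hidden, n_visible))    # receptive fields (rows of W)
b = rng.normal(size=n_visible)                # visible biases
c = rng.normal(size=n_hidden)                 # hidden biases

def free_energy(v, W, b, c):
    # Free energy of a binary RBM: F(v) = -b.v - sum_j log(1 + exp(W_j.v + c_j)),
    # so that p(v) is proportional to exp(-F(v)).
    return -v @ b - np.sum(np.logaddexp(0.0, W @ v + c))

v = rng.integers(0, 2, size=n_visible).astype(float)

# Permute the two hidden units: swap the rows of W and the entries of c.
perm = [1, 0]
F_original = free_energy(v, W, b, c)
F_permuted = free_energy(v, W[perm], b, c[perm])
assert np.isclose(F_original, F_permuted)     # p(v) is invariant under the swap
print(F_original, F_permuted)

Because the swap leaves p(v) unchanged for every visible configuration, the likelihood landscape is invariant under the permutation; the spontaneous symmetry breaking discussed in the abstract refers to learning selecting one of these equivalent solutions once the data size crosses the critical value.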


                Author and article information

                Journal
                30 April 2019
                Article
arXiv:1904.13052

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                History
                Custom metadata
                33 pages, 103 equations, 4 figures
                cond-mat.dis-nn cond-mat.stat-mech cs.LG q-bio.NC stat.ML

Condensed matter, Theoretical physics, Machine learning, Neurosciences, Artificial intelligence
