      Is Open Access

      Critical Behavior from Deep Dynamics: A Hidden Dimension in Natural Language

      Preprint


          Abstract

          We show that in many data sequences - from texts in different languages to melodies and genomes - the mutual information between two symbols decays roughly like a power law with the number of symbols in between the two. In contrast, we prove that Markov/hidden Markov processes generically exhibit exponential decay in their mutual information, which explains why natural languages are poorly approximated by Markov processes. We present a broad class of models that naturally reproduce this critical behavior. They all involve deep dynamics of a recursive nature, as can be approximately implemented by tree-like or recurrent deep neural networks. This model class captures the essence of probabilistic context-free grammars as well as recursive self-reproduction in physical phenomena such as turbulence and cosmological inflation. We derive an analytic formula for the asymptotic power law and elucidate our results in a statistical physics context: 1-dimensional "shallow" models (such as Markov models or regular grammars) will fail to model natural language, because they cannot exhibit criticality, whereas "deep" models with one or more "hidden" dimensions representing levels of abstraction or scale can potentially succeed.
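The contrast the abstract draws can be sketched numerically. The block below is a minimal illustration under stated assumptions, not the authors' code: the function names `mutual_information_at_lag` and `symmetric_chain_mi` and the parameter `flip_prob` are hypothetical, and the two-state symmetric Markov chain is only a convenient toy case chosen because its mutual information has a closed form. The first function is a plug-in estimate of the mutual information between symbols a given number of positions apart; the second evaluates that quantity exactly for the toy chain.

```python
import math
from collections import Counter


def mutual_information_at_lag(seq, lag):
    """Plug-in estimate, in bits, of I(X_t; X_{t+lag}) from a single sequence."""
    if lag < 1 or lag >= len(seq):
        raise ValueError("lag must satisfy 1 <= lag < len(seq)")
    pairs = list(zip(seq, seq[lag:]))          # all symbol pairs separated by `lag`
    n = len(pairs)
    joint = Counter(pairs)                     # counts of (x, y)
    left = Counter(x for x, _ in pairs)        # counts of the first symbol
    right = Counter(y for _, y in pairs)       # counts of the second symbol
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) / (p(x) p(y)) simplifies to c * n / (left[x] * right[y])
        mi += (c / n) * math.log2(c * n / (left[x] * right[y]))
    return mi


def symmetric_chain_mi(flip_prob, lag):
    """Exact I(X_t; X_{t+lag}), in bits, for a stationary two-state Markov
    chain that switches state with probability `flip_prob` at each step
    (an illustrative toy model, assumed here for simplicity)."""
    lam = 1.0 - 2.0 * flip_prob                # second eigenvalue of the transition matrix
    same = 0.5 * (1.0 + lam ** lag)            # P(X_{t+lag} == X_t)
    mi = 0.0
    for p in (same, 1.0 - same):
        if p > 0.0:
            # both marginals are uniform, so each case contributes p * log2(2p)
            mi += p * math.log2(2.0 * p)
    return mi


if __name__ == "__main__":
    for d in (1, 2, 4, 8, 16):
        print(d, symmetric_chain_mi(0.1, d))
```

For the toy chain, the ratio of mutual information at successive separations approaches `(1 - 2 * flip_prob) ** 2`, which is the exponential decay the abstract attributes to Markov processes. Applying `mutual_information_at_lag` to a long text at increasing lags is one way to look for the slower, power-law-like decay instead; note that the plug-in estimate is biased upward when the sequence is short relative to the alphabet size.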


          Author and article information

          Dates: 2016-06-21, 2016-07-11
          Type: Article
          arXiv ID: 1606.06737

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          Custom metadata
          Theorem generalized to hidden Markov models; connections added to the linguistics literature on context-free grammars. References added. 17 pages, 5 figures
          cond-mat.dis-nn cs.CL

          Theoretical computer science, Theoretical physics
