
      Emergent linguistic structure in artificial neural networks trained by self-supervision


          Abstract

          This paper explores the knowledge of linguistic structure learned by large artificial neural networks, trained via self-supervision, whereby the model simply tries to predict a masked word in a given context. Human language communication is via sequences of words, but language understanding requires constructing rich hierarchical structures that are never observed explicitly. The mechanisms for this have been a prime mystery of human language acquisition, while engineering work has mainly proceeded by supervised learning on treebanks of sentences hand labeled for this latent structure. However, we demonstrate that modern deep contextual language models learn major aspects of this structure, without any explicit supervision. We develop methods for identifying linguistic hierarchical structure emergent in artificial neural networks and demonstrate that components in these models focus on syntactic grammatical relationships and anaphoric coreference. Indeed, we show that a linear transformation of learned embeddings in these models captures parse tree distances to a surprising degree, allowing approximate reconstruction of the sentence tree structures normally assumed by linguists. These results help explain why these models have brought such large improvements across many language-understanding tasks.
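The abstract's central probing result — that a linear transformation of the learned embeddings approximately recovers parse-tree distances — can be illustrated with a toy structural probe. Everything below is a minimal sketch under invented assumptions: the embeddings are random stand-ins for real contextual vectors (e.g. BERT hidden states), and the four-word sentence and its gold tree distances are made up; only the shape of the idea (learn a linear map `B` so that squared L2 distances between transformed word vectors match tree distances) follows the paper.

```python
import numpy as np

# Toy structural probe: learn a linear map B so that squared L2 distances
# between B-transformed word vectors approximate parse-tree distances.
# All data here is synthetic; a real probe would use contextual embeddings
# and treebank parses.
rng = np.random.default_rng(0)
n_words, dim, rank = 4, 16, 8
H = rng.normal(size=(n_words, dim))          # stand-in contextual embeddings

# Invented gold tree distances for a 4-word sentence (edge counts in a
# star-shaped dependency tree with word 2 as the hub).
tree_dist = np.array([[0., 1., 2., 2.],
                      [1., 0., 1., 1.],
                      [2., 1., 0., 2.],
                      [2., 1., 2., 0.]])

B = rng.normal(scale=0.1, size=(rank, dim))  # the linear probe to be learned

def probe_dist(B, H):
    """Squared L2 distance between B-transformed embeddings, all word pairs."""
    T = H @ B.T
    diff = T[:, None, :] - T[None, :, :]
    return (diff ** 2).sum(-1)

err_before = np.abs(probe_dist(B, H) - tree_dist).mean()

# Fit B by gradient descent on the L1 gap between predicted and gold distances.
lr = 0.01
Hdiff = H[:, None, :] - H[None, :, :]        # (n, n, dim) embedding differences
for _ in range(2000):
    T = H @ B.T
    diff = T[:, None, :] - T[None, :, :]     # (n, n, rank)
    pred = (diff ** 2).sum(-1)
    sign = np.sign(pred - tree_dist)
    # d pred_ij / d B_rd = 2 * diff_ijr * Hdiff_ijd; average over all pairs.
    grad = 2 * np.einsum('ijr,ijd,ij->rd', diff, Hdiff, sign) / n_words**2
    B -= lr * grad

err_after = np.abs(probe_dist(B, H) - tree_dist).mean()
```

On this toy data the probe's average distance error drops well below its initialization, mirroring (in miniature) the paper's finding that such a linear map suffices to read off approximate tree structure from learned representations.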

          Related collections

          Most cited references (25)


          A Fast and Accurate Dependency Parser using Neural Networks


            Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies


              What Does BERT Look at? An Analysis of BERT’s Attention


                Author and article information

                Journal: Proceedings of the National Academy of Sciences (Proc Natl Acad Sci USA)
                ISSN: 0027-8424 (print); 1091-6490 (electronic)
                Publication date: June 03 2020
                Article: 201907367
                DOI: 10.1073/pnas.1907367117
                PMC ID: 7720155
                PMID: 32493748
                ScienceOpen ID: b8a0313a-af22-4278-82a7-be5e97bf1d50
                © 2020

                Free to read
                License: https://www.pnas.org/site/aboutpnas/licenses.xhtml
