Open Access

      Align, Mask and Select: A Simple Method for Incorporating Commonsense Knowledge into Language Representation Models

      Preprint


          Abstract

          Neural language representation models such as Bidirectional Encoder Representations from Transformers (BERT), pre-trained on large-scale corpora, capture rich semantics from plain text and can be fine-tuned to consistently improve performance on various natural language processing (NLP) tasks. However, existing pre-trained language representation models rarely incorporate commonsense or other structured knowledge explicitly. In this paper, we develop a pre-training approach for incorporating commonsense knowledge into language representation models. We construct a commonsense-related multi-choice question answering dataset for pre-training a neural language representation model; the dataset is created automatically by our proposed "align, mask, and select" (AMS) method. We also investigate different pre-training tasks. Experimental results demonstrate that pre-training with the proposed approach followed by fine-tuning achieves significant improvements on commonsense-related tasks such as CommonsenseQA and the Winograd Schema Challenge, while maintaining performance comparable to the original BERT models on other NLP tasks, such as sentence classification and natural language inference (NLI).
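          The abstract describes the "align, mask, and select" steps only at a high level. A minimal sketch of the idea, purely illustrative (the triple, concept pool, and function names below are hypothetical, not taken from the paper's code): align a sentence with a commonsense triple whose concepts it mentions, mask the aligned concept to form a question, and select distractor concepts as wrong answer choices.

```python
import random
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Triple:
    head: str
    relation: str
    tail: str

def align(sentence: str, triples: List[Triple]) -> Optional[Triple]:
    # "Align": find a knowledge triple whose head and tail both occur
    # in the sentence.
    for t in triples:
        if t.head in sentence and t.tail in sentence:
            return t
    return None

def mask_and_select(sentence: str, triple: Triple,
                    concept_pool: List[str], n_distractors: int = 4):
    # "Mask": replace the tail concept with a mask token to form a question.
    question = sentence.replace(triple.tail, "[MASK]")
    # "Select": draw distractor concepts from the pool as wrong answers.
    distractors = random.sample(
        [c for c in concept_pool if c != triple.tail], n_distractors)
    choices = distractors + [triple.tail]
    random.shuffle(choices)
    return question, choices, triple.tail

# Toy example with one ConceptNet-style triple and a matching sentence.
triples = [Triple("bird", "CapableOf", "fly")]
pool = ["swim", "bark", "read", "drive", "sing"]
sentence = "A bird can fly."
t = align(sentence, triples)
question, choices, answer = mask_and_select(sentence, t, pool)
print(question)  # A bird can [MASK].
```

Each (question, choices, answer) instance then serves as one multi-choice pre-training example; the paper runs this construction automatically over a large corpus rather than hand-built examples as here.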

          Related collections

          Most cited references (1)


          Modeling Relations and Their Mentions without Labeled Text


            Author and article information

            Journal: arXiv (preprint)
            Published: 19 August 2019
            Article ID: 1908.06725
            Record ID: f82667c0-5b94-418e-b4f1-941f7b701d31
            License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
            Subject: cs.CL
            Category: Theoretical computer science
