      Is Open Access

      A large annotated corpus for learning natural language inference

      Preprint


          Abstract

          Understanding entailment and contradiction is fundamental to understanding natural language, and inference about entailment and contradiction is a valuable testing ground for the development of semantic representations. However, machine learning research in this area has been dramatically limited by the lack of large-scale resources. To address this, we introduce the Stanford Natural Language Inference corpus, a new, freely available collection of labeled sentence pairs, written by humans doing a novel grounded task based on image captioning. At 570K pairs, it is two orders of magnitude larger than all other resources of its type. This increase in scale allows lexicalized classifiers to outperform some sophisticated existing entailment models, and it allows a neural network-based model to perform competitively on natural language inference benchmarks for the first time.

          Author and article information

          Date: 21 August 2015
          arXiv: 1508.05326
          4821ac4e-f2e7-4bc3-a8e8-5549cd66af14

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          To appear at EMNLP 2015. The data will be posted shortly before the conference (the week of 14 Sep) at http://nlp.stanford.edu/projects/snli/
          arXiv category: cs.CL
