      Is Open Access

      Word Mover's Embedding: From Word2Vec to Document Embedding

      Preprint


          Abstract

          While the celebrated Word2Vec technique yields semantically rich representations for individual words, there has been relatively less success in extending it to generate unsupervised sentence or document embeddings. Recent work has demonstrated that Word Mover's Distance (WMD), a distance measure between documents that aligns semantically similar words, yields unprecedented KNN classification accuracy. However, WMD is expensive to compute, and it is hard to extend its use beyond a KNN classifier. In this paper, we propose the Word Mover's Embedding (WME), a novel approach to building an unsupervised document (sentence) embedding from pre-trained word embeddings. In our experiments on 9 benchmark text classification datasets and 22 textual similarity tasks, the proposed technique consistently matches or outperforms state-of-the-art techniques, with significantly higher accuracy on problems of short length.
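
          To make the notion of a distance that "aligns semantically similar words" concrete, here is a minimal sketch of WMD in a restricted special case. It is not the paper's implementation. It assumes two documents with the same number of tokens, each token carrying equal weight, in which case the optimal transport plan reduces to a one-to-one matching that can be found by brute force. The toy 2-D vectors and the names `TOY_EMBEDDINGS` and `wmd_equal_length` are hypothetical and stand in for real pre-trained embeddings.

```python
import itertools
import math

# Hypothetical 2-D "word embeddings", invented purely for illustration.
# Real use would load pre-trained vectors (e.g. Word2Vec) instead.
TOY_EMBEDDINGS = {
    "obama": (1.0, 0.0),
    "president": (1.0, 0.2),
    "speaks": (0.0, 1.0),
    "greets": (0.1, 1.0),
    "media": (-1.0, 0.0),
    "press": (-1.0, 0.1),
}


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def wmd_equal_length(doc_a, doc_b, embeddings):
    """WMD for the special case of two documents with equally many tokens.

    Each token carries mass 1/n, so the optimal transport plan is a
    one-to-one matching; we search all matchings by brute force.
    This is exponential in n and only meant for tiny examples.
    """
    if len(doc_a) != len(doc_b) or not doc_a:
        raise ValueError("this sketch needs two non-empty documents of equal length")
    n = len(doc_a)
    best = math.inf
    for perm in itertools.permutations(range(n)):
        cost = sum(
            euclidean(embeddings[doc_a[i]], embeddings[doc_b[perm[i]]])
            for i in range(n)
        ) / n
        best = min(best, cost)
    return best


if __name__ == "__main__":
    a = ["obama", "speaks", "media"]
    b = ["president", "greets", "press"]
    print(wmd_equal_length(a, b, TOY_EMBEDDINGS))
```

          The factorial search over matchings illustrates why the abstract calls WMD expensive. The general case, with unequal lengths and weights, requires solving a transport linear program for every pair of documents.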


                Author and article information

                Date: 30 October 2018
                arXiv ID: 1811.01713
                9c739bd6-5abc-4e0e-a116-964bbf90ea5f

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                Custom metadata
                EMNLP'18 Camera-Ready Version
                cs.CL cs.AI cs.LG stat.ML

                Theoretical computer science, Machine learning, Artificial intelligence
