
      Pangloss: Fast Entity Linking in Noisy Text Environments

      Preprint


          Abstract

          Entity linking is the task of mapping potentially ambiguous terms in text to their corresponding entities in a knowledge base such as Wikipedia. This is useful for organizing content, extracting structured data from textual documents, and in machine learning relevance applications such as semantic search, knowledge graph construction, and question answering. Traditionally, this work has focused on well-formed text, such as news articles, but in common real-world datasets such as messaging, resumes, or short-form social media, non-grammatical, loosely structured text adds a new dimension to the problem. This paper presents Pangloss, a production system for entity disambiguation on noisy text. Pangloss combines a probabilistic linear-time key phrase identification algorithm with a semantic similarity engine based on context-dependent document embeddings to achieve better than state-of-the-art results (more than 5% higher F1) compared to other research or commercially available systems. In addition, Pangloss uses a local embedded database with a tiered architecture to house its statistics and metadata, which allows rapid disambiguation in streaming contexts and on-device disambiguation in low-memory environments such as mobile phones.


                Author and article information

                Date: 16 July 2018
                arXiv: 1807.06036
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Custom metadata: KDD 2018
                Categories: cs.IR cs.LG stat.ML
                Subjects: Information & Library science, Machine learning, Artificial intelligence
