      Studying the Wikipedia Hyperlink Graph for Relatedness and Disambiguation

      Preprint

          Abstract

          Hyperlinks and other relations in Wikipedia are an extraordinary resource which is still not fully understood. In this paper we study the different types of links in Wikipedia, and contrast the use of the full graph with the use of direct links alone. We apply a well-known random walk algorithm to two tasks, word relatedness and named-entity disambiguation. We show that using the full graph is more effective than using just direct links by a large margin, that non-reciprocal links harm performance, and that there is no benefit from categories and infoboxes, with coherent results on both tasks. We set new state-of-the-art figures for systems based on Wikipedia links, comparable to systems exploiting several information sources and/or supervised machine learning. Our approach is open source, with instructions to reproduce the results, and amenable to integration with complementary text-based methods.
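The abstract does not name the random walk algorithm, so the following is only an illustrative sketch that assumes Personalized PageRank as a stand-in. It shows, on a made-up toy link list, how one might drop non-reciprocal links and then score relatedness by comparing the random-walk distributions seeded at two pages. The names `TOY_LINKS`, `reciprocal_only`, `personalized_pagerank`, and `relatedness`, as well as the damping value and the cosine comparison, are assumptions for illustration and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy link graph: (source page, target page). Not real Wikipedia data.
TOY_LINKS = [
    ("Paris", "France"), ("France", "Paris"),
    ("France", "Europe"), ("Europe", "France"),
    ("Paris", "Eiffel Tower"),          # non-reciprocal link
    ("Banana", "Fruit"), ("Fruit", "Banana"),
]


def reciprocal_only(links):
    """Keep only directed links (a, b) whose reverse (b, a) is also present."""
    link_set = set(links)
    return [(a, b) for (a, b) in links if (b, a) in link_set]


def personalized_pagerank(links, seeds, damping=0.85, iterations=50):
    """Power-iteration Personalized PageRank over a directed link list.

    Teleport mass returns to the seed pages instead of being spread uniformly.
    """
    out = defaultdict(list)
    nodes = set(seeds)
    for a, b in links:
        out[a].append(b)
        nodes.update((a, b))
    teleport = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(teleport)
    for _ in range(iterations):
        nxt = {n: (1.0 - damping) * teleport[n] for n in nodes}
        for n in nodes:
            targets = out[n]
            if targets:
                share = damping * rank[n] / len(targets)
                for t in targets:
                    nxt[t] += share
            else:
                # Dangling page: hand its mass back to the seeds.
                for s, w in teleport.items():
                    nxt[s] += damping * rank[n] * w
        rank = nxt
    return rank


def relatedness(links, a, b):
    """Cosine similarity between the walk distributions seeded at a and at b."""
    ra = personalized_pagerank(links, [a])
    rb = personalized_pagerank(links, [b])
    keys = set(ra) | set(rb)
    dot = sum(ra.get(k, 0.0) * rb.get(k, 0.0) for k in keys)
    na = sum(v * v for v in ra.values()) ** 0.5
    nb = sum(v * v for v in rb.values()) ** 0.5
    return dot / (na * nb) if na and nb else 0.0
```

For example, `relatedness(reciprocal_only(TOY_LINKS), "Paris", "France")` compares the two pages on the reciprocal-only graph, which mirrors in miniature the link-filtering contrast the abstract describes.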

                Author and article information

                Journal
                2015-03-05
                2015-03-12
                Article
                arXiv:1503.01655
                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                cs.CL

                Theoretical computer science
