      Is Open Access

      Salient Phrase Aware Dense Retrieval: Can a Dense Retriever Imitate a Sparse One?

      Preprint


          Abstract

          Despite their recent popularity and well-known advantages, dense retrievers still lag behind sparse methods such as BM25 in their ability to reliably match salient phrases and rare entities in the query. It has been argued that this is an inherent limitation of dense models. We disprove this claim by introducing the Salient Phrase Aware Retriever (SPAR), a dense retriever with the lexical matching capacity of a sparse model. In particular, we show that a dense retriever Λ can be trained to imitate a sparse one, and SPAR is built by augmenting a standard dense retriever with Λ. When evaluated on five open-domain question answering datasets and the MS MARCO passage retrieval task, SPAR sets a new state of the art for dense and sparse retrievers and can match or exceed the performance of more complicated dense-sparse hybrid systems.
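
The abstract does not say how the standard dense retriever is augmented with Λ. One plausible reading, shown here only as a minimal sketch under that assumption, is to concatenate the two models' embeddings so that a single inner product equals the base score plus a weighted Λ score. The function names, the `weight` parameter, and the toy vectors are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    """Plain inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def augment_query(q_base: Sequence[float], q_lambda: Sequence[float],
                  weight: float) -> List[float]:
    """Concatenate the base query embedding with the weighted Lambda embedding.

    `weight` is a hypothetical knob for how much the sparse-imitating
    model Lambda contributes to the final score.
    """
    return list(q_base) + [weight * x for x in q_lambda]


def augment_passage(p_base: Sequence[float],
                    p_lambda: Sequence[float]) -> List[float]:
    """Concatenate the base and Lambda passage embeddings (no weighting)."""
    return list(p_base) + list(p_lambda)


# Toy embeddings (made up for illustration, not real model outputs).
q_base, q_lambda = [0.5, 1.0], [2.0, 0.0]
p_base, p_lambda = [1.0, 1.0], [1.0, 3.0]
weight = 0.5

# Score from a single inner product over the concatenated vectors.
combined = dot(augment_query(q_base, q_lambda, weight),
               augment_passage(p_base, p_lambda))

# Same score computed as base score + weight * Lambda score.
separate = dot(q_base, p_base) + weight * dot(q_lambda, p_lambda)
```

Under this assumption the augmented model remains a single dense retriever: one vector per query, one per passage, and one inner-product search, with no separate sparse index at query time.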

          Author and article information

          Date: 13 October 2021
          arXiv ID: 2110.06918

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          Custom metadata
          cs.CL cs.IR cs.LG

          Theoretical computer science, Information & Library science, Artificial intelligence
