The novelty task consists of finding relevant and novel sentences in a ranking of documents given a query. Different techniques have been applied to this problem in the literature. Nevertheless, little is known about Language Models for novelty detection and, especially, about the effect of smoothing on the selection of novel sentences. Language Models allow novelty and relevance to be studied in a principled way, and these statistical models have been shown to perform well empirically in many Information Retrieval tasks. In this work we formally study the effects of smoothing on novelty detection. To this end, we compare different techniques based on the Kullback-Leibler divergence and analyze the sensitivity of retrieval performance to the smoothing parameters. The ability of Language Modeling estimation methods to quantify the uncertainty associated with the use of natural language is a powerful tool that can drive the future development of novelty-based mechanisms.
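To illustrate the kind of mechanism the abstract refers to, the following is a minimal sketch (not the paper's implementation) of novelty scoring based on the Kullback-Leibler divergence between smoothed unigram Language Models. The Jelinek-Mercer smoothing, the lambda value, the toy sentences, and all function names are illustrative assumptions.

```python
# Sketch: KL-divergence novelty scoring with Jelinek-Mercer smoothed unigram LMs.
# All names, data, and the smoothing parameter are illustrative assumptions.
import math
from collections import Counter

def smoothed_lm(tokens, background, lam=0.7):
    """Jelinek-Mercer smoothing: lam * P_ml(w | tokens) + (1 - lam) * P(w | background)."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return lambda w: lam * (counts[w] / total if total else 0.0) + (1 - lam) * background(w)

def kl_divergence(p_model, q_model, vocab):
    """KL(P || Q) over a shared vocabulary; smoothing keeps q_model(w) > 0."""
    score = 0.0
    for w in vocab:
        p = p_model(w)
        if p > 0:
            score += p * math.log(p / q_model(w))
    return score

# Toy data: sentences already shown to the user vs. a candidate sentence.
seen = "language models rank documents for a query".split()
candidate = "smoothing controls the novelty of retrieved sentences".split()
collection = seen + candidate
vocab = set(collection)

# Background (collection) model used to smooth both estimates.
coll_counts = Counter(collection)
coll_total = sum(coll_counts.values())
background = lambda w: coll_counts[w] / coll_total

# A larger divergence from the model of previously seen text suggests higher novelty.
novelty = kl_divergence(smoothed_lm(candidate, background), smoothed_lm(seen, background), vocab)
print(f"novelty score: {novelty:.4f}")
```

In such a setup the smoothing parameter directly changes how much probability mass novel terms receive from the background model, which is why the paper's analysis of sensitivity to smoothing parameters matters for the final sentence ranking.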
Author and article information
Contributors
Ronald T. Fernández
Conference
Publication date: August 2007
Publication date (Print): August 2007
Pages: 1-6
Affiliations
[0001] Departamento de Electrónica y Computación, Universidad de Santiago de Compostela, Campus Sur, s/n, 15782 Santiago de Compostela, Spain