Patent retrieval is an information retrieval task with very specific characteristics and demands. In particular, high recall is very important to patent searchers. In the ongoing research project TM4IP, we aim to improve patent retrieval by developing an open-domain patent retrieval system based on linguistic knowledge. By using Dependency Triplets as index terms, our system aims to improve on the precision and recall of keyword-based approaches. One of the cornerstones of a syntactic approach to Information Retrieval is normalisation. This paper describes some of the characteristics of the patent domain that influence lexical normalisation.
Author and article information
Contributors
Eva D’hondt
Publication date: September 2009
Publication date (Print): September 2009
Pages: 102-109
Affiliations
Centre for Language and Speech Technology, Radboud University Nijmegen, The Netherlands