This paper presents a hierarchical system that combines machine-learning-based Information Retrieval (IR) techniques with stochastic language modelling to extract surface information from text. At the lower levels of the hierarchy, documents and then paragraphs are successively routed with IR techniques. At the top level, a stochastic language model extracts the most relevant phrases and labels the type of information they contain. The approach and preliminary results are demonstrated on a subset of the MUC-6 Scenario Templates task.
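The hierarchy described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: hand-set keyword weights stand in for the learned IR routers, a tiny hidden Markov model decoded with Viterbi stands in for the stochastic language model, and the label names (`PERSON`, `POST`, `O`), probabilities, threshold, and example texts are hypothetical.

```python
import math
import re

def tokenize(text):
    """Lowercase word tokens; punctuation is dropped."""
    return re.findall(r"\w+", text.lower())

def relevance(text, weights):
    """Length-normalised sum of keyword weights (hypothetical stand-in for a learned router)."""
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return sum(weights.get(t, 0.0) for t in tokens) / len(tokens)

def route(units, weights, threshold):
    """Keep only the units (documents or paragraphs) scored as relevant."""
    return [u for u in units if relevance(u, weights) >= threshold]

def viterbi(tokens, states, start_p, trans_p, emit_p, floor=1e-6):
    """Most likely label sequence under an HMM, computed in log space."""
    def lp(p):
        return math.log(max(p, floor))  # floor avoids log(0) for unseen events
    scores = {s: lp(start_p.get(s, 0.0)) + lp(emit_p[s].get(tokens[0], 0.0)) for s in states}
    back = []
    for tok in tokens[1:]:
        new_scores, ptr = {}, {}
        for s in states:
            best = max(states, key=lambda r: scores[r] + lp(trans_p[r].get(s, 0.0)))
            new_scores[s] = scores[best] + lp(trans_p[best].get(s, 0.0)) + lp(emit_p[s].get(tok, 0.0))
            ptr[s] = best
        scores = new_scores
        back.append(ptr)
    path = [max(states, key=lambda s: scores[s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

def labelled_phrases(tokens, labels, background="O"):
    """Group consecutive tokens that share a non-background label into phrases."""
    phrases, current, current_label = [], [], None
    for tok, lab in zip(tokens, labels):
        if lab != background and lab == current_label:
            current.append(tok)
        else:
            if current:
                phrases.append((current_label, " ".join(current)))
            current, current_label = ([tok], lab) if lab != background else ([], None)
    if current:
        phrases.append((current_label, " ".join(current)))
    return phrases

def extract(documents, weights, threshold, hmm):
    """Route documents, then paragraphs, then label phrases in what remains."""
    found = []
    for doc in route(documents, weights, threshold):
        paragraphs = [p for p in doc.split("\n\n") if p.strip()]
        for par in route(paragraphs, weights, threshold):
            tokens = tokenize(par)
            if tokens:
                found.extend(labelled_phrases(tokens, viterbi(tokens, **hmm)))
    return found

# Hypothetical toy parameters; a real system would estimate these from data.
STATES = ["O", "PERSON", "POST"]
UNIFORM = {s: 1.0 / len(STATES) for s in STATES}
HMM = {
    "states": STATES,
    "start_p": {"O": 0.5, "PERSON": 0.4, "POST": 0.1},
    "trans_p": {s: UNIFORM for s in STATES},
    "emit_p": {
        "O": {"was": 0.3, "named": 0.3, "of": 0.2, "acme": 0.2},
        "PERSON": {"smith": 0.9},
        "POST": {"ceo": 0.9},
    },
}
WEIGHTS = {"named": 1.0, "ceo": 1.0, "resigned": 1.0}
DOCS = ["The weather was fine today.", "Board news.\n\nSmith was named CEO of Acme."]

result = extract(DOCS, WEIGHTS, 0.1, HMM)
print(result)
```

The same `route` function is applied at both the document and the paragraph level, so each stage narrows the text passed to the more expensive labelling step. Whether the original system shares one router across levels is not stated in the abstract.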
Author and article information
Contributors
Hugo Zaragoza
Patrick Gallinari
Publication date: March 1998
Publication date (Print): March 1998
Pages: 1-9
Affiliations
[0001] LIP6, Université Pierre et Marie Curie, 4, place Jussieu, F-75252 Paris cedex 05 (F).