This paper addresses the problem of accessing very large heterogeneous document collections by proposing a new approach to using clustering for information retrieval: mediated access through a clustered collection. In what is effectively an information access environment, the user can explore a relatively small, well-structured, pre-clustered collection covering a particular subject domain, in order to understand the concepts it encompasses and to clarify and refine their information need. The user can ostensively indicate clusters and documents of interest and be assisted in formulating a query, which is then used to search a large, unstructured collection. Finally, the original cluster structure is the basis for visualisation tools that allow the user to explore the search results. WebCluster, the system implementing these ideas, is presented, together with the results of an initial formative experiment and plans for future experiments.
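The abstract does not say how the query is derived from the clusters and documents the user indicates. As a minimal sketch of one plausible approach, assuming documents are already tokenised and that candidate terms are ranked by a simple tf-idf style score, the step could look like the following. The function name `suggest_query_terms`, the scoring formula, and the toy data are illustrative assumptions, not WebCluster's actual method.

```python
import math
from collections import Counter


def suggest_query_terms(selected_docs, collection, k=5):
    """Rank terms from user-selected documents by a tf-idf style score.

    Hypothetical helper: the weighting below is an assumption made for
    illustration, not the scheme used by WebCluster.
    """
    n_docs = len(collection)
    # Document frequency of each term across the small clustered collection.
    df = Counter(term for doc in collection for term in set(doc))
    # Term frequency pooled over the documents the user marked as interesting.
    tf = Counter(term for doc in selected_docs for term in doc)
    # Frequent-in-selection, rare-in-collection terms score highest.
    scores = {t: tf[t] * math.log((n_docs + 1) / (df[t] + 0.5)) for t in tf}
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy clustered collection: two documents about retrieval, two about sport.
collection = [
    ["cluster", "retrieval", "query"],
    ["cluster", "visualisation", "query"],
    ["football", "score", "query"],
    ["football", "league", "query"],
]
selected = collection[:2]  # the user indicates the first two documents
print(suggest_query_terms(selected, collection, k=2))
```

Terms that occur in every document, such as `query` in the toy data, receive a low score, so the suggested query favours vocabulary that distinguishes the indicated documents from the rest of the clustered collection.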
Author and article information
Contributors
David J. Harper
Mourad Mechkour
Gheorghe Muresan
Conference
Publication date: April 1999
Publication date (Print): April 1999
Pages: 1-13
Affiliations
[0001] School of Computer and Mathematical Sciences, The Robert Gordon University, Aberdeen AB25 1HG, Scotland, UK