      Is Open Access

      An Architecture for Efficient Document Clustering and Retrieval on a Dynamic Collection of Newspaper Texts

      Alan F. Smeaton et al.

      20th Annual BCS-IRSG Colloquium on IR (IRSG)

      BCS-IRSG Colloquium on IR

      25-27 March 1998


          Abstract

          Clustering related or similar objects has long been regarded as a potentially useful way to help users navigate an information space such as a document collection. Documents that are about the same or similar topics are often relevant to the same queries, and this relationship can be exploited during retrieval. Many clustering algorithms and techniques have been developed and implemented since the earliest days of computational information retrieval, but as document collections have grown, these techniques have not been scaled to large collections because of their computational overhead. In this paper we describe a technique for clustering a collection of documents, such as a collection of online newspapers, which uses a number of short-cuts to make the process computable for large collections. Furthermore, our design is extensible in that it caters for a dynamic collection whose documents would be periodically, perhaps nightly, updated, amended or deleted. An implementation of the clustering on an archive of the Irish Times newspaper is reported here.


            Author and article information

            Conference: March 1998
            Pages: 1-9
            Affiliations
            School of Computer Applications

            Dublin City University

            Glasnevin, Dublin 9, IRELAND
            DOI: 10.14236/ewic/IRSG1998.10
            © Alan F. Smeaton et al. Published by BCS Learning and Development Ltd. 20th Annual BCS-IRSG Colloquium on IR, Autrans, France

            This work is licensed under a Creative Commons Attribution 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

            Electronic Workshops in Computing (eWiC)
            Product Information: 1477-9358, BCS Learning & Development
            Self URI (journal page): https://ewic.bcs.org/
            Categories
            Electronic Workshops in Computing
