      Is Open Access

      Summarization of Changes in Dynamic Text Collections

      Fifth BCS-IRSG Symposium on Future Directions in Information Access (FDIA 2013) (FDIA)

      Future Directions in Information Access (FDIA 2013)

      3 September 2013

      Changes summarization, Update summarization, Dynamic text collections


          Abstract

          Information Retrieval is the field of Informatics primarily concerned with the problems and challenges of information storage and access. The large majority of work in this area is based on static collections of documents. However, many of these collections are dynamic and evolve over time, with documents being added, edited or removed at different times. Even in highly dynamic environments such as the World Wide Web, research tends to center on the most recent version of the documents, and past information is normally discarded. Recognizing these changes in dynamic text collections and exploiting them for document retrieval and presentation purposes introduces new and relevant research challenges. This paper addresses an opportunity that gains relevance in this context: summarization of changes in dynamic text collections. We first define the problem as producing a summary that describes textual changes to an entire document or a set of related documents over a user-defined time period. We then present an extensive overview of relevant approaches from the literature that address similar problems, and close with a discussion that includes future directions.
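          To make the problem statement concrete, the following is a minimal sketch of one possible baseline, not the method proposed in the paper. It assumes each document is stored as a list of dated versions, splits text into sentences with a naive regular expression, and reports the sentences added or removed between the earliest and latest versions inside a user-defined period. The names `split_sentences`, `summarize_changes` and `changes_in_period` are hypothetical and introduced here only for illustration.

```python
import re
from datetime import date
from difflib import SequenceMatcher


def split_sentences(text):
    # Naive splitter (an assumption): break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize_changes(old_text, new_text):
    # Align the two versions at sentence level and collect the differences.
    old, new = split_sentences(old_text), split_sentences(new_text)
    added, removed = [], []
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            removed.extend(old[i1:i2])
        if tag in ("insert", "replace"):
            added.extend(new[j1:j2])
    return {"added": added, "removed": removed}


def changes_in_period(versions, start, end):
    # versions: list of (date, text) pairs for one document (hypothetical layout).
    in_range = sorted((v for v in versions if start <= v[0] <= end), key=lambda v: v[0])
    if len(in_range) < 2:
        # Fewer than two versions in the period: nothing to compare.
        return {"added": [], "removed": []}
    return summarize_changes(in_range[0][1], in_range[-1][1])


versions = [
    (date(2013, 1, 5), "The bridge opened in 1990. It carries two lanes."),
    (date(2013, 6, 1), "The bridge opened in 1990. It carries four lanes. A toll was introduced."),
]
summary = changes_in_period(versions, date(2013, 1, 1), date(2013, 12, 31))
```

          A sentence-level diff of this kind only lists raw edits; selecting and ranking which changes are worth reporting, and handling a set of related documents, are the parts that a real change summarizer would need to add.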

                Author and article information

                Conference
                September 2013
                Pages: 14-19
                Affiliations
                PhD in Computer Science (MAP-i), INESC TEC, Universidade do Porto

                Rua Dr. Roberto Frias, s/n 4200-465 Porto, Portugal
                Article
                10.14236/ewic/FDIA2013.4
                © Manika Kar. Published by BCS Learning and Development Ltd. Fifth BCS-IRSG Symposium on Future Directions in Information Access (FDIA 2013), Granada, Spain

                This work is licensed under a Creative Commons Attribution 4.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/

                Electronic Workshops in Computing (eWiC)
                Product Information: 1477-9358, BCS Learning & Development
                Self URI (journal page): https://ewic.bcs.org/
                Categories
                Electronic Workshops in Computing
