      Is Open Access

      A Closer Look at Data Bias in Neural Extractive Summarization Models

      Preprint

          Abstract

          In this paper, we take stock of the current state of summarization datasets and explore how different factors of datasets influence the generalization behaviour of neural extractive summarization models. Specifically, we first propose several properties of datasets, which matter for the generalization of summarization models. Then we build the connection between priors residing in datasets and model designs, analyzing how different properties of datasets influence the choices of model structure design and training methods. Finally, by taking a typical dataset as an example, we rethink the process of the model design based on the experience of the above analysis. We demonstrate that when we have a deep understanding of the characteristics of datasets, a simple approach can bring significant improvements to the existing state-of-the-art model.

                Author and article information

                Journal
                30 September 2019
                Article
                1909.13705

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                Custom metadata
                EMNLP 2019 Workshop on New Frontiers in Summarization
                cs.CL

                Theoretical computer science