      Open Access

      Europarl-ST: A Multilingual Corpus For Speech Translation Of Parliamentary Debates

      Preprint

          Abstract

          Current research into spoken language translation (SLT) is often hampered by the lack of specific data resources for this task, as currently available SLT datasets are restricted to a limited set of language pairs. In this paper we present Europarl-ST, a novel multilingual SLT corpus containing paired audio-text samples for SLT from and into 6 European languages, for a total of 30 different translation directions. This corpus has been compiled using the debates held in the European Parliament in the period between 2008 and 2012. This paper describes the corpus creation process and presents a series of automatic speech recognition, machine translation and spoken language translation experiments that highlight the potential of this new resource. The corpus is released under a Creative Commons license and is freely accessible and downloadable.

              Author and article information

              Date: 08 November 2019
              arXiv ID: 1911.03167
              License: http://creativecommons.org/licenses/by-nc-sa/4.0/
              Submitted to ICASSP 2020
              arXiv categories: cs.CL, cs.SD, eess.AS

              Theoretical computer science, Electrical engineering, Graphics & Multimedia design
