      Is Open Access

      The Sequence-to-Sequence Baseline for the Voice Conversion Challenge 2020: Cascading ASR and TTS

      Preprint


          Abstract

          This paper presents the sequence-to-sequence (seq2seq) baseline system for the Voice Conversion Challenge (VCC) 2020. We consider a naive approach to voice conversion (VC): first transcribe the input speech with an automatic speech recognition (ASR) model, then use the transcriptions to generate the voice of the target with a text-to-speech (TTS) model. We revisit this method under a seq2seq framework by utilizing ESPnet, an open-source end-to-end speech processing toolkit, and the many well-configured pretrained models provided by the community. Official evaluation results show that our system comes out top among the participating systems in terms of conversion similarity, demonstrating the promising ability of seq2seq models to convert speaker identity. The implementation is made open-source at: https://github.com/espnet/espnet/tree/master/egs/vcc20.
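
The cascade described in the abstract can be illustrated with a minimal sketch. This is not the ESPnet recipe and does not use any ESPnet API: the `CascadeVC` class, the `Waveform` alias, and the `toy_asr` / `toy_tts` stand-ins are all hypothetical names introduced here only to show the data flow (source speech, then text, then target speech).

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical alias: a waveform is just a list of samples in this sketch.
Waveform = List[float]


@dataclass
class CascadeVC:
    """Voice conversion by chaining a recognizer and a synthesizer.

    Both components are plain callables so that any ASR or TTS model
    can be plugged in; the names here are assumptions, not ESPnet's API.
    """
    asr: Callable[[Waveform], str]   # source speech -> transcription
    tts: Callable[[str], Waveform]   # transcription -> target-speaker speech

    def convert(self, source_speech: Waveform) -> Waveform:
        # Step 1: transcribe the input speech.
        text = self.asr(source_speech)
        # Step 2: synthesize the transcription in the target voice.
        return self.tts(text)


# Toy stand-ins so the sketch runs without any trained model.
def toy_asr(speech: Waveform) -> str:
    # Pretends every utterance says the same thing.
    return "hello world"


def toy_tts(text: str) -> Waveform:
    # Emits one silent sample per character of the transcription.
    return [0.0] * len(text)


vc = CascadeVC(asr=toy_asr, tts=toy_tts)
converted = vc.convert([0.1, -0.2, 0.3])
```

In this arrangement the only thing passed between the two stages is text, so the voice of the output is determined entirely by the TTS stage, and the source speaker's identity is discarded at the transcription step.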


          Author and article information

          Journal
          05 October 2020
          Article
          2010.02434
          2aa81cd3-8bd0-4816-bc28-ff105a2d5a93

          http://arxiv.org/licenses/nonexclusive-distrib/1.0/

          Custom metadata
          Accepted to the ISCA Joint Workshop for the Blizzard Challenge and Voice Conversion Challenge 2020
          eess.AS cs.CL cs.SD

          Theoretical computer science, Graphics & Multimedia design, Electrical engineering
