      Is Open Access

      Video-to-Video Translation for Visual Speech Synthesis

      Preprint


          Abstract

          Despite the remarkable success of image-to-image translation, driven by advances in generative adversarial networks (GANs), very few attempts at video-domain translation are known. We study the task of video-to-video translation in the context of visual speech generation, where the goal is to transform an input video of any spoken word into an output video of a different word. This is a multi-domain translation problem, in which each word defines a domain consisting of videos of that word being uttered. Adapting the state-of-the-art image-to-image translation model (StarGAN) to this setting falls short when the vocabulary is large. Instead, we propose to use character encodings of the words and design a novel character-based GAN architecture for video-to-video translation called Visual Speech GAN (ViSpGAN). We are the first to demonstrate video-to-video translation with a vocabulary of 500 words.
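
To make the contrast in the abstract concrete, the sketch below compares a one-hot word label (one domain per word, as in a StarGAN-style setup) with a character-level encoding of the same word. It is only an illustration under stated assumptions: the alphabet, the padding symbol, the `max_len` value, and the function names `word_one_hot` and `char_encoding` are hypothetical and are not taken from the paper, whose actual encoding is not described on this page.

```python
import numpy as np

# Assumed alphabet: lowercase English letters plus one padding symbol.
ALPHABET = "abcdefghijklmnopqrstuvwxyz"
PAD = len(ALPHABET)  # index reserved for padding


def word_one_hot(word, vocab):
    """One-hot domain label: the vector length grows with the vocabulary."""
    vec = np.zeros(len(vocab), dtype=np.float32)
    vec[vocab.index(word)] = 1.0
    return vec


def char_encoding(word, max_len=8):
    """Per-character one-hot codes: the shape is fixed by max_len and the
    alphabet size, independent of how many words are in the vocabulary.
    Assumes the word contains only letters from ALPHABET."""
    codes = np.zeros((max_len, len(ALPHABET) + 1), dtype=np.float32)
    letters = word.lower()[:max_len]
    for i in range(max_len):
        idx = ALPHABET.index(letters[i]) if i < len(letters) else PAD
        codes[i, idx] = 1.0
    return codes


vocab = ["about", "water", "people"]       # toy vocabulary, illustration only
label = word_one_hot("water", vocab)       # shape (3,), grows with the vocabulary
chars = char_encoding("water")             # shape (8, 27), fixed size
```

With a 500-word vocabulary the one-hot label would have 500 entries, one per domain, while the character encoding keeps the same shape and shares letters across words.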

          Most cited references (17)

          • Image-to-Image Translation with Conditional Adversarial Networks
          • Identity Mappings in Deep Residual Networks
          • StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation

                Author and article information

                Date: 28 May 2019
                arXiv: 1905.12043
                License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
                Subject: cs.CV (Computer vision & Pattern recognition)