Video-to-Video Translation for Visual Speech Synthesis

Preprint

Author(s): Michail C. Doukas, Viktoriia Sharmanska, Stefanos Zafeiriou

Publication date Created: 28 May 2019

Abstract

Despite the remarkable success of image-to-image translation, driven by advances in generative adversarial networks (GANs), very few attempts at video domain translation are known. We study the task of video-to-video translation in the context of visual speech generation, where the goal is to transform an input video of any spoken word into an output video of a different word. This is a multi-domain translation problem, in which each word defines a domain consisting of the videos in which that word is uttered. Adapting the state-of-the-art image-to-image translation model (StarGAN) to this setting falls short when the vocabulary is large. Instead, we propose to use character encodings of the words and design a novel character-based GAN architecture for video-to-video translation, called Visual Speech GAN (ViSpGAN). We are the first to demonstrate video-to-video translation with a vocabulary of 500 words.
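
The contrast the abstract draws between per-word domain labels and character encodings can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the encoding used by ViSpGAN, and the alphabet, the maximum word length MAX_LEN, and both function names are hypothetical. The sketch only shows that a one-hot word label grows with the vocabulary, while a character-level encoding keeps a fixed size.

```python
import string

# Assumed for illustration only: lowercase English alphabet and a
# hypothetical maximum word length. Neither comes from the abstract.
ALPHABET = string.ascii_lowercase
MAX_LEN = 12


def one_hot_word_label(word, vocabulary):
    """Domain label with one slot per vocabulary word (length = vocabulary size)."""
    label = [0] * len(vocabulary)
    label[vocabulary.index(word)] = 1
    return label


def character_encoding(word, max_len=MAX_LEN):
    """One one-hot row per character, zero-padded to max_len rows."""
    if len(word) > max_len:
        raise ValueError("word is longer than max_len")
    rows = []
    for ch in word.lower():
        row = [0] * len(ALPHABET)
        row[ALPHABET.index(ch)] = 1  # raises ValueError for non-letters
        rows.append(row)
    # Pad with all-zero rows so every word has the same shape.
    rows.extend([0] * len(ALPHABET) for _ in range(max_len - len(word)))
    return rows


if __name__ == "__main__":
    vocabulary = ["about", "water", "world"]  # toy vocabulary
    print(len(one_hot_word_label("water", vocabulary)))  # grows with the vocabulary
    encoded = character_encoding("water")
    print(len(encoded), len(encoded[0]))  # fixed by MAX_LEN and the alphabet
```

With a 500-word vocabulary the word label in this sketch would have 500 entries, whereas the character encoding keeps its MAX_LEN by 26 shape regardless of how many words are added.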

Most cited references (17)

Image-to-Image Translation with Conditional Adversarial Networks
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou … (2017)

Identity Mappings in Deep Residual Networks
Kaiming He, Xiangyu Zhang, Shaoqing Ren … (2016)

StarGAN: Unified Generative Adversarial Networks for Multi-domain Image-to-Image Translation
Yunjey Choi, Minje Choi, Munyoung Kim … (2018)

Author and article information

arXiv ID: 1905.12043

SO-VID: 5ad9caa6-0aae-4a07-9156-9ec075afd2d5

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Categories: cs.CV

ScienceOpen disciplines: Computer vision & Pattern recognition