      Is Open Access

      GANSynth: Adversarial Neural Audio Synthesis

      Preprint


          Abstract

          Efficient audio synthesis is an inherently difficult machine learning task, as human perception is sensitive to both global structure and fine-scale waveform coherence. Autoregressive models, such as WaveNet, model local structure but sacrifice global latent structure and rely on slow iterative sampling, whereas Generative Adversarial Networks (GANs) offer global latent conditioning and efficient parallel sampling but struggle to generate locally coherent audio waveforms. Herein, we demonstrate that GANs can in fact generate high-fidelity, locally coherent audio by modeling log magnitudes and instantaneous frequencies with sufficient frequency resolution in the spectral domain. Through extensive empirical investigations on the NSynth dataset, we demonstrate that GANs outperform strong WaveNet baselines on automated and human evaluation metrics and generate audio several orders of magnitude faster than their autoregressive counterparts.
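
The spectral representation named in the abstract (log magnitudes plus instantaneous frequencies) can be illustrated with a short NumPy sketch. This is not the authors' implementation: the helper names `stft` and `log_magnitude_and_if`, the frame size, the hop size, and the scaling of instantaneous frequency by pi are all assumptions chosen for illustration. Instantaneous frequency is approximated here as the frame-to-frame difference of the unwrapped phase.

```python
import numpy as np

def stft(signal, frame_size=512, hop=128):
    """Hann-windowed short-time Fourier transform, shape (frames, bins).

    frame_size and hop are illustrative values, not taken from the paper.
    """
    window = np.hanning(frame_size)
    n_frames = 1 + (len(signal) - frame_size) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_size] * window
        for i in range(n_frames)
    ])
    return np.fft.rfft(frames, axis=1)

def log_magnitude_and_if(spec, eps=1e-6):
    """Split a complex spectrogram into log magnitude and instantaneous frequency."""
    log_mag = np.log(np.abs(spec) + eps)
    # Unwrap phase along the time axis so consecutive frames differ by at most pi.
    phase = np.unwrap(np.angle(spec), axis=0)
    # Finite difference over time; the first frame is set to 0 by the prepend.
    # Dividing by pi (an assumed normalization) keeps values in [-1, 1].
    inst_freq = np.diff(phase, axis=0, prepend=phase[:1]) / np.pi
    return log_mag, inst_freq

# Usage: a pure tone centered on bin 9 advances its phase by a constant
# amount per hop, so its instantaneous frequency is constant over time.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * (9 * sample_rate / 512) * t)
log_mag, inst_freq = log_magnitude_and_if(stft(tone))
```

For a steady tone the instantaneous frequency is flat across frames, whereas the raw wrapped phase keeps cycling; this is the kind of regularity that a spectral-domain model could exploit, under the assumptions stated above.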


                Author and article information

                Journal
                22 February 2019
                Article
                1902.08710
                0460b260-bd30-41b2-8c36-5ac1dc598458

                http://arxiv.org/licenses/nonexclusive-distrib/1.0/

                Custom metadata
                Colab Notebook: http://goo.gl/magenta/gansynth-demo
                cs.SD cs.LG eess.AS stat.ML

                Machine learning, Artificial intelligence, Electrical engineering, Graphics & Multimedia design
