      Is Open Access

      Leveraging Acoustic Cues and Paralinguistic Embeddings to Detect Expression from Voice

      Preprint


          Abstract

          Millions of people reach out to digital assistants such as Siri every day, asking for information, making phone calls, seeking assistance, and much more. The expectation is that such assistants should understand the intent of the user's query. Detecting the intent of a query from a short, isolated utterance is a difficult task, and intent cannot always be obtained from speech-recognized transcriptions. A transcription-driven approach can interpret what has been said but fails to acknowledge how it has been said and, as a consequence, may ignore the expression present in the voice. Our work investigates whether a system can reliably detect vocal expression in queries using acoustic and paralinguistic embeddings. Results show that the proposed method offers a relative equal error rate (EER) decrease of 60% compared to a bag-of-words-based system, corroborating that expression is significantly represented by vocal attributes rather than being purely lexical. Adding an emotion embedding reduced the EER by 30% relative to the acoustic embedding, demonstrating the relevance of emotion in expressive voice.
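The abstract reports its results as relative EER reductions. As a reading aid, the sketch below shows one common way to approximate an EER from detector scores and how a relative reduction is computed. It is a minimal illustration under stated assumptions, not the authors' evaluation code: the function names, the threshold sweep over observed scores, and the toy scores and labels are all hypothetical, and the abstract does not state absolute EER values.

```python
from typing import Sequence


def equal_error_rate(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Approximate the EER by sweeping thresholds over the observed scores.

    labels: 1 = expressive utterance, 0 = neutral utterance (assumed encoding).
    An utterance is accepted as expressive when score >= threshold.
    Returns the mean of the false-acceptance and false-rejection rates at the
    threshold where the two rates are closest.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    best_gap, best_eer = float("inf"), 1.0
    for threshold in sorted(set(scores)):
        frr = sum(s < threshold for s in positives) / len(positives)   # missed expressive
        far = sum(s >= threshold for s in negatives) / len(negatives)  # false alarms
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def relative_reduction(baseline: float, system: float) -> float:
    """Relative decrease of `system` with respect to `baseline` (0.6 means 60%)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - system) / baseline


# Hypothetical toy data, invented purely for illustration.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]
toy_eer = equal_error_rate(toy_scores, toy_labels)
```

With this definition, a drop from a hypothetical baseline EER of 0.25 to 0.10 is a 60% relative decrease, and a drop from 0.10 to 0.07 is a 30% relative decrease; these absolute values are placeholders chosen only to make the arithmetic concrete.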


                Author and article information

                Date: 28 June 2019
                Type: Article
                arXiv ID: 1907.00112
                Record ID: ce6dae2e-9ff7-4bee-8e5e-f64aec23d587
                License: http://creativecommons.org/licenses/by-nc-sa/4.0/
                Length: 5 pages, 6 figures
                arXiv categories: cs.CL cs.LG cs.SD eess.AS
                Subjects: Theoretical computer science, Artificial intelligence, Electrical engineering, Graphics & Multimedia design
