Is Open Access

Jointly Aligning and Predicting Continuous Emotion Annotations

Preprint

Author(s): Soheil Khorram, Melvin G McInnis, Emily Mower Provost

Publication date (created): 05 July 2019

Abstract

Time-continuous dimensional descriptions of emotions (e.g., arousal, valence) allow researchers to characterize short-time changes and to capture long-term trends in emotion expression. However, continuous emotion labels are generally not synchronized with the input speech signal due to delays caused by reaction time, which is inherent in human evaluations. To deal with this challenge, we introduce a new convolutional neural network (multi-delay sinc network) that is able to simultaneously align and predict labels in an end-to-end manner. The proposed network is a stack of convolutional layers followed by an aligner network that aligns the speech signal and emotion labels. This network is implemented using a new convolutional layer that we introduce, the delayed sinc layer. It is a time-shifted low-pass (sinc) filter that uses a gradient-based algorithm to learn a single delay. Multiple delayed sinc layers can be used to compensate for a non-stationary delay that is a function of the acoustic space. We test the efficacy of this system on two common emotion datasets, RECOLA and SEWA, and show that this approach obtains state-of-the-art speech-only results by learning time-varying delays while predicting dimensional descriptors of emotions.
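
The idea of a time-shifted low-pass (sinc) filter can be illustrated with a small sketch. The code below is not the authors' implementation. It is a minimal pure-Python illustration that assumes a truncated kernel of the form h[n] = 2 f_c sinc(2 f_c (n - d)); the cutoff `cutoff`, the truncation `half_width`, and all function names are hypothetical choices made for this example. It shows only that convolving with such a kernel delays a slowly varying signal by d samples, and that d may be fractional, so the kernel is a smooth function of the delay.

```python
import math

def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def delayed_sinc_kernel(delay, cutoff=0.25, half_width=64):
    """Taps of a low-pass sinc filter whose peak is shifted by `delay` samples.

    delay      -- shift in samples, may be fractional
    cutoff     -- assumed cutoff in cycles per sample, 0 < cutoff <= 0.5
    half_width -- assumed truncation: taps cover n = -half_width..half_width
    """
    return [2.0 * cutoff * sinc(2.0 * cutoff * (n - delay))
            for n in range(-half_width, half_width + 1)]

def apply_delay(signal, delay, cutoff=0.25, half_width=64):
    """Convolve `signal` with the delayed sinc kernel (edges are zero-padded)."""
    taps = delayed_sinc_kernel(delay, cutoff, half_width)
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, h in zip(range(-half_width, half_width + 1), taps):
            if 0 <= t - k < len(signal):
                acc += signal[t - k] * h
        out.append(acc)
    return out

# Usage: delay a slow sinusoid by 5.5 samples and compare with the exact shift.
freq = 0.02  # cycles per sample, well below the assumed cutoff
signal = [math.sin(2 * math.pi * freq * t) for t in range(400)]
shifted = apply_delay(signal, delay=5.5)
target = [math.sin(2 * math.pi * freq * (t - 5.5)) for t in range(400)]

# Compare away from the edges, where zero-padding does not affect the result.
max_err = max(abs(a - b) for a, b in zip(shifted[100:300], target[100:300]))
print("max interior error:", max_err)
```

The sketch fixes the delay by hand. Learning the delay by gradient descent, and combining several such layers to handle a non-stationary delay, are described in the abstract but are not covered here.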

Author and article information

Article

DOI: 10.1109/TAFFC.2019.2917047

arXiv ID: 1907.03050

SO-VID: 24fe3012-85a7-4002-9934-105f8927f15f

License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/

Comments: IEEE Transactions on Affective Computing

Categories: cs.LG, cs.HC, eess.AS, stat.ML

ScienceOpen disciplines: Machine learning, Artificial intelligence, Electrical engineering, Human-computer-interaction
