
      The quantification of gesture–speech synchrony: A tutorial and validation of multimodal data acquisition using device-based and video-based motion tracking


          Abstract

          There is increasing evidence that hand gestures and speech synchronize their activity on multiple dimensions and timescales. For example, gesture’s kinematic peaks (e.g., maximum speed) are coupled with prosodic markers in speech. Such coupling operates on very short timescales at the level of syllables (200 ms), and therefore requires high-resolution measurement of gesture kinematics and speech acoustics. High-resolution speech analysis is common for gesture studies, given that field’s classic ties with (psycho)linguistics. However, the field has lagged behind in the objective study of gesture kinematics (e.g., as compared to research on instrumental action). Often kinematic peaks in gesture are measured by eye, where a “moment of maximum effort” is determined by several raters. In the present article, we provide a tutorial on more efficient methods to quantify the temporal properties of gesture kinematics, in which we focus on common challenges and possible solutions that come with the complexities of studying multimodal language. We further introduce and compare, using an actual gesture dataset (392 gesture events), the performance of two video-based motion-tracking methods (deep learning vs. pixel change) against a high-performance wired motion-tracking system (Polhemus Liberty). We show that the videography methods perform well in the temporal estimation of kinematic peaks, and thus provide a cheap alternative to expensive motion-tracking systems. We hope that the present article incites gesture researchers to embark on the widespread objective study of gesture kinematics and their relation to speech.
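Concretely, the two video-based measures named above reduce to short numerical routines: a kinematic peak is the maximum of the differentiated, smoothed position trace of a tracked keypoint, and pixel change is the mean absolute gray-level difference between consecutive frames. The sketch below illustrates both under stated assumptions (NumPy/SciPy/OpenCV, a single 2-D keypoint, placeholder smoothing parameters and video path); it is a minimal illustration, not the authors' released pipeline.

```python
# Minimal sketch: timing of a gesture's kinematic peak (maximum speed) from
# tracked 2-D positions, plus a simple pixel-change motion measure.
import numpy as np
import cv2  # pip install opencv-python
from scipy.signal import savgol_filter, find_peaks

def peak_speed_time(x, y, fps, window=11, poly=3):
    """Return the time (s) of maximum speed for one tracked keypoint.

    x, y : per-frame positions (e.g., an index-finger tip); fps : frame rate.
    window/poly are assumed Savitzky-Golay settings, not prescribed values.
    """
    # Smooth positions to suppress tracking jitter before differentiating.
    x_s = savgol_filter(x, window, poly)
    y_s = savgol_filter(y, window, poly)
    # Speed = magnitude of frame-to-frame displacement, scaled to units/s.
    speed = np.hypot(np.diff(x_s), np.diff(y_s)) * fps
    peaks, _ = find_peaks(speed)
    if len(peaks) == 0:                      # monotone trace: fall back to argmax
        return float(np.argmax(speed)) / fps
    return float(peaks[np.argmax(speed[peaks])]) / fps

def pixel_change_series(video_path):
    """Mean absolute gray-level change between consecutive frames."""
    cap = cv2.VideoCapture(video_path)
    ok, prev = cap.read()
    if not ok:
        raise IOError("could not read " + video_path)
    prev = cv2.cvtColor(prev, cv2.COLOR_BGR2GRAY)
    series = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        series.append(cv2.absdiff(gray, prev).mean())
        prev = gray
    cap.release()
    return np.asarray(series)
```

Passing the output of pixel_change_series through the same find_peaks call yields a peak time directly comparable to the keypoint-based estimate, which is the kind of cross-method comparison the tutorial reports.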


Most cited references (27)


          Deep Residual Learning for Image Recognition


            DeepLabCut: markerless pose estimation of user-defined body parts with deep learning

            Quantifying behavior is crucial for many applications in neuroscience. Videography provides easy methods for the observation and recording of animal behavior in diverse settings, yet extracting particular aspects of a behavior for further analysis can be highly time consuming. In motor control studies, humans or other animals are often marked with reflective markers to assist with computer-based tracking, but markers are intrusive, and the number and location of the markers must be determined a priori. Here we present an efficient method for markerless pose estimation based on transfer learning with deep neural networks that achieves excellent results with minimal training data. We demonstrate the versatility of this framework by tracking various body parts in multiple species across a broad collection of behaviors. Remarkably, even when only a small number of frames are labeled (~200), the algorithm achieves excellent tracking performance on test frames that is comparable to human accuracy.
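As a rough orientation, the transfer-learning workflow this reference describes runs through a handful of documented DeepLabCut calls: create a project, label a small set of frames, fine-tune the pretrained (ResNet-based) network, and analyze new videos. The outline below follows the published API naming as of this writing; the project name, experimenter, and video paths are placeholders.

```python
# Hedged outline of the DeepLabCut workflow; paths and names are hypothetical.
import deeplabcut

config = deeplabcut.create_new_project(
    "gesture-tracking", "experimenter",
    ["/data/videos/session01.mp4"],          # placeholder video
    copy_videos=False,
)

deeplabcut.extract_frames(config)            # pick a small frame set to label
deeplabcut.label_frames(config)              # opens the labeling GUI
deeplabcut.create_training_dataset(config)   # split labels into train/test
deeplabcut.train_network(config)             # fine-tune the pretrained backbone
deeplabcut.evaluate_network(config)          # pixel error on held-out frames
deeplabcut.analyze_videos(config, ["/data/videos/session01.mp4"])
```

analyze_videos writes per-frame keypoint coordinates, which can feed a speed/peak routine like the one sketched after the abstract above.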

Coding gestural behavior with the NEUROGES-ELAN system

              We present a coding system combined with an annotation tool for the analysis of gestural behavior. The NEUROGES coding system consists of three modules that progress from gesture kinetics to gesture function. Grounded on empirical neuropsychological and psychological studies, the theoretical assumption behind NEUROGES is that its main kinetic and functional movement categories are differentially associated with specific cognitive, emotional, and interactive functions. ELAN is a free, multimodal annotation tool for digital audio and video media. It supports multileveled transcription and complies with such standards as XML and Unicode. ELAN allows gesture categories to be stored with associated vocabularies that are reusable by means of template files. The combination of the NEUROGES coding system and the annotation tool ELAN creates an effective tool for empirical research on gestural behavior.
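Since ELAN stores annotations as XML (.eaf), time-aligned gesture codes can be pulled into an analysis script with the standard library alone. The sketch below assumes the documented .eaf layout, a TIME_ORDER of millisecond time slots referenced by each tier's ALIGNABLE_ANNOTATIONs; the file path and tier name are hypothetical, and dedicated readers such as pympi exist for production use.

```python
# Minimal sketch: read time-aligned annotations from one tier of an ELAN .eaf file.
import xml.etree.ElementTree as ET

def read_tier(eaf_path, tier_id):
    """Return (start_ms, end_ms, value) for each annotation on one tier."""
    root = ET.parse(eaf_path).getroot()
    # Map each time-slot ID to its millisecond value (skip unaligned slots).
    slots = {ts.get("TIME_SLOT_ID"): int(ts.get("TIME_VALUE"))
             for ts in root.find("TIME_ORDER")
             if ts.get("TIME_VALUE") is not None}
    rows = []
    for tier in root.iter("TIER"):
        if tier.get("TIER_ID") != tier_id:
            continue
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            rows.append((slots[ann.get("TIME_SLOT_REF1")],
                         slots[ann.get("TIME_SLOT_REF2")],
                         ann.findtext("ANNOTATION_VALUE")))
    return rows

# e.g. read_tier("session01.eaf", "Gesture")  # hypothetical file and tier name
```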

                Author and article information

                Contributors
                wimpouw@uconn.edu
Journal
Behavior Research Methods (Behav Res Methods)
Publisher: Springer US, New York
ISSN: 1554-351X (print); 1554-3528 (electronic)
Published online: 28 October 2019; issue date 2020
Volume 52, Issue 2, pp. 723–740
Affiliations
[1] Center for the Ecological Study of Perception and Action, University of Connecticut, Storrs, CT, USA
[2] Department of Psychology, Education, & Child Studies, Erasmus University Rotterdam, Rotterdam, The Netherlands
[3] Donders Institute for Brain, Cognition, and Behaviour, Radboud University, Nijmegen, The Netherlands
[4] Centre for Language Studies, Radboud University, Nijmegen, The Netherlands
Author information
ORCID: http://orcid.org/0000-0003-2729-6502
Article
DOI: 10.3758/s13428-019-01271-9
PMCID: PMC7148275
PMID: 31659689
                © The Author(s) 2019

                Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License ( http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Funding
Funded by: The Netherlands Organisation for Scientific Research (NWO)
Award ID: 446-16-012
Categories: Article
© The Psychonomic Society, Inc. 2020

Subject: Clinical Psychology & Psychiatry
Keywords: motion tracking, video recording, deep learning, gesture and speech analysis, multimodal language
