      Is Open Access

      Speech Discrimination in Real-World Group Communication Using Audio-Motion Multimodal Sensing

      research-article


          Abstract

          Speech discrimination, which determines whether a participant is speaking at a given moment, is essential for investigating human verbal communication. In dynamic real-world situations where multiple people form groups and converse in the same space, simultaneous speakers make speech discrimination based solely on audio sensing difficult. In this study, we focused on physical activity during speech and hypothesized that combining audio and physical motion data acquired by wearable sensors can improve speech discrimination. Utterance and physical activity data of students in a participatory university class were therefore recorded using smartphones worn around their necks. First, we tested the temporal relationship between manually identified utterances and physical motions and confirmed that physical activity across a wide range of frequencies co-occurred with utterances. Second, we trained and tested classifiers for each participant and found higher performance with the audio-motion classifier (average accuracy 92.2%) than with either the audio-only (80.4%) or the motion-only (87.8%) classifier. Finally, we tested inter-individual classification and obtained higher performance with the combined audio-motion classifier (83.2%) than with either the audio-only (67.7%) or the motion-only (71.9%) classifier. These results show that audio-motion multimodal sensing using widely available smartphones can provide effective utterance discrimination in dynamic group communication.
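
The idea of combining audio and motion data per time window can be illustrated with a minimal sketch. The abstract does not say which features or which classifier the authors used, so everything here is an assumption made for illustration: one RMS energy feature per modality, a nearest-centroid classifier, and made-up toy values. The helper names (`rms`, `fuse`, `fit_centroids`, `predict`) are hypothetical and do not come from the article.

```python
import math

def rms(window):
    """Root-mean-square amplitude of one window of samples."""
    return math.sqrt(sum(x * x for x in window) / len(window))

def fuse(audio_window, motion_window):
    """Feature-level fusion: one audio and one motion feature per window (assumed features)."""
    return (rms(audio_window), rms(motion_window))

def fit_centroids(features, labels):
    """Nearest-centroid model: the mean feature vector of each class (assumed classifier)."""
    centroids = {}
    for label in set(labels):
        rows = [f for f, y in zip(features, labels) if y == label]
        centroids[label] = tuple(sum(col) / len(rows) for col in zip(*rows))
    return centroids

def predict(centroids, feature):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda label: math.dist(centroids[label], feature))

# Toy windows with made-up values: (audio samples, accelerometer samples, wearer is speaking)
windows = [
    ([0.8, -0.8, 0.8, -0.8], [0.5, -0.5, 0.5, -0.5], True),   # wearer speaks and moves
    ([0.6, -0.6, 0.6, -0.6], [0.4, -0.4, 0.4, -0.4], True),
    ([0.6, -0.6, 0.6, -0.6], [0.1, -0.1, 0.1, -0.1], False),  # a neighbor speaks, wearer is still
    ([0.1, -0.1, 0.1, -0.1], [0.1, -0.1, 0.1, -0.1], False),  # silence
]
features = [fuse(audio, motion) for audio, motion, _ in windows]
labels = [speaking for _, _, speaking in windows]
model = fit_centroids(features, labels)

# A loud window with little body motion is attributed to someone else's speech.
loud_but_still = fuse([0.6, -0.6, 0.6, -0.6], [0.1, -0.1, 0.1, -0.1])
# A moderately loud window with clear body motion is attributed to the wearer.
moderate_and_moving = fuse([0.5, -0.5, 0.5, -0.5], [0.5, -0.5, 0.5, -0.5])
print(predict(model, loud_but_still), predict(model, moderate_and_moving))
```

In this toy data the second and third training windows have the same audio energy but different labels, so an audio-only feature could not separate them; the motion feature is what distinguishes the wearer's own speech from a neighbor's. That mirrors the motivation stated in the abstract, not its reported results.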

                Author and article information

                Journal: Sensors (Basel, Switzerland)
                Publisher: MDPI
                ISSN: 1424-8220
                Published: 22 May 2020
                Volume 20, Issue 10, Article 2948
                Affiliations
                [1] Research Institute for the Earth Inclusive Sensing, Tokyo Institute of Technology, Tokyo 152-8550, Japan
                [2] School of Computing, Tokyo Institute of Technology, Yokohama 226-8502, Japan; uchiyama.m.ac@m.titech.ac.jp (M.U.); honda.k.al@m.titech.ac.jp (K.H.)
                [3] Institute for Liberal Arts, Tokyo Institute of Technology, Tokyo 152-8550, Japan; tamio.nakano@me.com
                [4] Department of Computer Science, Tokyo Institute of Technology, Yokohama 226-8502, Japan; miyake@c.titech.ac.jp
                Author notes
                [*] Correspondence: nozawa.t.ac@m.titech.ac.jp; Tel.: +81-3-5734-3048
                Author information
                https://orcid.org/0000-0001-6300-4373
                Article
                Publisher ID: sensors-20-02948
                DOI: 10.3390/s20102948
                PMCID: PMC7287755
                PMID: 32456031
                2c620563-830a-4af0-9591-811a0300d8a7
                © 2020 by the authors.

                Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license ( http://creativecommons.org/licenses/by/4.0/).

                History
                Received: 31 March 2020
                Accepted: 20 May 2020
                Categories
                Article

                Biomedical engineering
                Keywords: speech discrimination, group communication, physical motion, multimodal sensing, sensor fusion, smartphone
