
      The Role of Pitch and Timbre in Voice Gender Categorization


      Frontiers in Psychology

      Frontiers Research Foundation

      audition, categorical perception, voice, mixture model



          Voice gender perception can be thought of as a mixture of low-level perceptual feature extraction and higher-level cognitive processes. Although it seems apparent that voice gender perception would rely on low-level pitch analysis, many lines of research suggest that this is not the case. Indeed, voice gender perception has been shown to rely on timbre perception and to be categorical, i.e., to depend on accessing a gender model or representation. Here, we used a unique combination of acoustic stimulus manipulation and mathematical modeling of human categorization performance to determine the relative contribution of pitch and timbre to this process. Contrary to the idea that voice gender perception relies on timbre only, we demonstrate that voice gender categorization can be performed using pitch only but, more importantly, that pitch is used only when timbre information is ambiguous (i.e., for more androgynous voices).


                Author and article information

                Journal: Frontiers in Psychology (Front Psychol)
                Publisher: Frontiers Research Foundation
                Date: 03 February 2012
                Volume: 3
                1 Brain Research Imaging Centre, Scottish Imaging Network – A Platform for Scientific Excellence Collaboration, University of Edinburgh, Edinburgh, UK
                2 Centre for Cognitive Neuroimaging, Institute of Neuroscience and Psychology, University of Glasgow, Glasgow, UK
                Author notes

                Edited by: David J. Freedman, University of Chicago, USA

                Reviewed by: Ricardo Gil-da-Costa, Salk Institute, USA; Aaron Seitz, University of California Riverside, USA; Shaowen Bao, University of California-Berkeley, USA

                *Correspondence: Cyril R. Pernet, Division of Clinical Neurosciences, Brain Research Imaging Centre, University of Edinburgh, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, UK. e-mail: cyril.pernet@

                This article was submitted to Frontiers in Perception Science, a specialty of Frontiers in Psychology.

                Copyright © 2012 Pernet and Belin.

                This is an open-access article distributed under the terms of the Creative Commons Attribution Non Commercial License, which permits non-commercial use, distribution, and reproduction in other forums, provided the original authors and source are credited.

                Page count
                Figures: 8, Tables: 5, Equations: 1, References: 22, Pages: 11, Words: 6540
                Original Research

                Clinical Psychology & Psychiatry


