Music in Our Ears: The Biological Bases of Musical Timbre Perception

Abstract

Timbre is the attribute of sound that allows humans and other animals to distinguish among different sound sources. Studies based on psychophysical judgments of musical timbre, ecological analyses of sound's physical characteristics as well as machine learning approaches have all suggested that timbre is a multifaceted attribute that invokes both spectral and temporal sound features. Here, we explored the neural underpinnings of musical timbre. We used a neuro-computational framework based on spectro-temporal receptive fields, recorded from over a thousand neurons in the mammalian primary auditory cortex as well as from simulated cortical neurons, augmented with a nonlinear classifier. The model was able to perform robust instrument classification irrespective of pitch and playing style, with an accuracy of 98.7%. Using the same front end, the model was also able to reproduce perceptual distance judgments between timbres as perceived by human listeners. The study demonstrates that joint spectro-temporal features, such as those observed in the mammalian primary auditory cortex, are critical to provide the rich-enough representation necessary to account for perceptual judgments of timbre by human listeners, as well as recognition of musical instruments.
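The idea of joint spectro-temporal features can be illustrated with a small sketch. The code below is not the authors' model: it assumes a toy time-frequency representation in place of an auditory spectrogram, and it stands in for cortical receptive fields with hypothetical 2D Gabor patches (the names `gabor_strf` and `modulation_features`, the 9x9 filter size, and the chosen scales and rates are all illustrative assumptions). Each filter is tuned to one combination of spectral modulation ("scale") and temporal modulation ("rate"), and the feature is the RMS energy of the filtered output.

```python
import numpy as np

def gabor_strf(n_freq, n_time, scale, rate):
    """Toy spectro-temporal filter (hypothetical stand-in for an STRF).

    scale: spectral modulation, in cycles per frequency channel.
    rate:  temporal modulation, in cycles per time frame (sign sets direction).
    """
    f = np.arange(n_freq) - (n_freq - 1) / 2
    t = np.arange(n_time) - (n_time - 1) / 2
    F, T = np.meshgrid(f, t, indexing="ij")
    envelope = np.exp(-0.5 * ((F / (n_freq / 4)) ** 2 + (T / (n_time / 4)) ** 2))
    carrier = np.cos(2 * np.pi * (scale * F + rate * T))
    kernel = envelope * carrier
    # Zero mean, so a flat spectrogram produces no response.
    return kernel - kernel.mean()

def modulation_features(spec, scales, rates, size=(9, 9)):
    """RMS response of each (scale, rate) filter to a spectrogram.

    Filtering is a circular 2D convolution computed with the FFT.
    Features are ordered with scales as the outer loop, rates as the inner loop.
    """
    spec_fft = np.fft.fft2(spec)
    feats = []
    for sc in scales:
        for r in rates:
            kernel_fft = np.fft.fft2(gabor_strf(size[0], size[1], sc, r), s=spec.shape)
            out = np.real(np.fft.ifft2(spec_fft * kernel_fft))
            feats.append(np.sqrt(np.mean(out ** 2)))
    return np.array(feats)

# Toy input: a moving ripple with scale 0.25 cycles/channel and rate 0.125 cycles/frame.
chan, frame = np.meshgrid(np.arange(32), np.arange(64), indexing="ij")
ripple = np.cos(2 * np.pi * (0.25 * chan + 0.125 * frame))

scales = [0.125, 0.25]
rates = [-0.125, 0.125]
ripple_feats = modulation_features(ripple, scales, rates)
flat_feats = modulation_features(np.ones((32, 64)), scales, rates)
```

In this sketch the filter matched to the ripple's scale and rate responds most strongly, while a filter with the right scale but the opposite rate responds weakly, which is the sense in which the features are joint rather than separately spectral and temporal. In a full pipeline, feature vectors of this kind would be passed to a nonlinear classifier, as the abstract describes.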

Author Summary

Music is a complex acoustic experience that we often take for granted. Whether sitting at a symphony hall or enjoying a melody over earphones, we have no difficulty identifying the instruments playing, following various beats, or simply distinguishing a flute from an oboe. Our brains rely on a number of sound attributes to analyze the music in our ears. These attributes can be straightforward like loudness or quite complex like the identity of the instrument. A major contributor to our ability to recognize instruments is what is formally called ‘timbre’. Of all perceptual attributes of music, timbre remains the most mysterious and least amenable to a simple mathematical abstraction. In this work, we examine the neural underpinnings of musical timbre in an attempt to both define its perceptual space and explore the processes underlying timbre-based recognition. We propose a scheme based on responses observed at the level of mammalian primary auditory cortex and show that it can accurately predict sound source recognition and perceptual timbre judgments by human listeners. The analyses presented here strongly suggest that rich representations such as those observed in auditory cortex are critical in mediating timbre percepts.

Most cited references 24

Rapid task-related plasticity of spectrotemporal receptive fields in primary auditory cortex.

Jonathan Fritz, Shihab Shamma, Mounya Elhilali … (2003)

We investigated the hypothesis that task performance can rapidly and adaptively reshape cortical receptive field properties in accord with specific task demands and salient sensory cues. We recorded neuronal responses in the primary auditory cortex of behaving ferrets that were trained to detect a target tone of any frequency. Cortical plasticity was quantified by measuring focal changes in each cell's spectrotemporal response field (STRF) in a series of passive and active behavioral conditions. STRF measurements were made simultaneously with task performance, providing multiple snapshots of the dynamic STRF during ongoing behavior. Attending to a specific target frequency during the detection task consistently induced localized facilitative changes in STRF shape, which were swift in onset. Such modulatory changes may enhance overall cortical responsiveness to the target tone and increase the likelihood of 'capturing' the attended target during the detection task. Some receptive field changes persisted for hours after the task was over and hence may contribute to long-term sensory memory.

"Who" is saying "what"? Brain-based decoding of human voice and speech.

Milene Bonte, Rainer Goebel, Elia Formisano … (2008)

Can we decipher speech content ("what" is being said) and speaker identity ("who" is saying it) from observations of brain activity of a listener? Here, we combine functional magnetic resonance imaging with a data-mining algorithm and retrieve what and whom a person is listening to from the neural fingerprints that speech and voice signals elicit in the listener's auditory cortex. These cortical fingerprints are spatially distributed and insensitive to acoustic variations of the input so as to permit the brain-based recognition of learned speech from unknown speakers and of learned voices from previously unheard utterances. Our findings unravel the detailed cortical layout and computational properties of the neural populations at the basis of human speech recognition and speaker identification.

Cortical representation of natural complex sounds: effects of acoustic features and auditory object category.

Amber M. Leaver, Josef P. Rauschecker (2010)

How the brain processes complex sounds, like voices or musical instrument sounds, is currently not well understood. The features comprising the acoustic profiles of such sounds are thought to be represented by neurons responding to increasing degrees of complexity throughout auditory cortex, with complete auditory "objects" encoded by neurons (or small networks of neurons) in anterior superior temporal regions. Although specialized voice and speech-sound regions have been proposed, it is unclear how other types of complex natural sounds are processed within this object-processing pathway. Using functional magnetic resonance imaging, we sought to demonstrate spatially distinct patterns of category-selective activity in human auditory cortex, independent of semantic content and low-level acoustic features. Category-selective responses were identified in anterior superior temporal regions, consisting of clusters selective for musical instrument sounds and for human speech. An additional subregion was identified that was particularly selective for the acoustic-phonetic content of speech. In contrast, regions along the superior temporal plane closer to primary auditory cortex were not selective for stimulus category, responding instead to specific acoustic features embedded in natural sounds, such as spectral structure and temporal modulation. Our results support a hierarchical organization of the anteroventral auditory-processing stream, with the most anterior regions representing the complete acoustic signature of auditory objects.

Author and article information

Contributors

Frederic E. Theunissen, Editor

Journal

Journal ID (nlm-ta): PLoS Comput Biol

Journal ID (iso-abbrev): PLoS Comput. Biol

Journal ID (publisher-id): plos

Journal ID (pmc): ploscomp

Title: PLoS Computational Biology

Publisher: Public Library of Science (San Francisco, USA)

ISSN (Print): 1553-734X

ISSN (Electronic): 1553-7358

Publication date Collection: November 2012

Publication date (Print): November 2012

Publication date (Electronic): 1 November 2012

Volume: 8

Issue: 11

Electronic Location Identifier: e1002759

Affiliations

[1] Department of Electrical and Computer Engineering, Center for Language and Speech Processing, Johns Hopkins University, Baltimore, Maryland, United States of America

[2] Laboratoire Psychologie de la Perception, CNRS-Université Paris Descartes & DEC, Ecole normale supérieure, Paris, France

[3] Department of Electrical and Computer Engineering and Institute for Systems Research, University of Maryland, College Park, Maryland, United States of America

Editor's affiliation: University of California at Berkeley, United States of America

Author notes

* E-mail: mounya@jhu.edu

The authors have declared that no competing interests exist.

Conceived and designed the experiments: KP DP SS ME. Performed the experiments: KP DP SS ME. Analyzed the data: KP DP SS ME. Contributed reagents/materials/analysis tools: KP DP SS ME. Wrote the paper: KP DP SS ME.

Article

Publisher ID: PCOMPBIOL-D-12-00485

DOI: 10.1371/journal.pcbi.1002759

PMC ID: 3486808

PubMed ID: 23133363

SO-VID: 964586e9-5aed-4f93-bf13-e8f1caac36db

License:

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

History

Date received: 23 March 2012

Date accepted: 12 September 2012

Page count

Pages: 16

Funding

This work was partly supported by grants from NSF CAREER IIS-0846112, AFOSR FA9550-09-1-0234, NIH 1R01AG036424-01 and ONR N000141010278. S. Shamma was partly supported by a Blaise-Pascal Chair, Région Ile de France, and by the program Research in Paris, Mairie de Paris. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
