
      Prediction across sensory modalities: A neurocomputational model of the McGurk effect.


          Abstract

The McGurk effect is a textbook illustration of the automaticity with which the human brain integrates audiovisual speech. It shows that even incongruent audiovisual (AV) speech stimuli can be combined into percepts that correspond neither to the auditory nor to the visual input alone, but to a mix of both. Typically, when presented with, e.g., visual /aga/ and acoustic /aba/, we perceive an illusory /ada/. In the inverse situation, however, when acoustic /aga/ is paired with visual /aba/, we perceive a combination of both stimuli, i.e., /abga/ or /agba/. Here we assessed the role of dynamic cross-modal predictions in the outcome of AV speech integration using a computational model that processes continuous audiovisual speech sensory inputs within a predictive coding framework. The model involves three processing levels: sensory units, units that encode the dynamics of stimuli, and multimodal recognition/identity units. The model exhibits dynamic prediction behavior because evidence about speech tokens can be asynchronous across sensory modalities, allowing the activity of the recognition units to be updated from one modality while top-down predictions are sent to the other. We explored the model's response to congruent and incongruent AV stimuli and found that, in the two-dimensional feature space spanned by the speech second formant and lip aperture, fusion stimuli are located in the neighborhood of congruent /ada/, which therefore provides a valid match. Conversely, stimuli that lead to combination percepts have no unique valid neighbor. In that case, acoustic and visual cues are both highly salient and generate conflicting predictions in the other modality that cannot be fused, forcing the elaboration of a combinatorial solution. We propose that dynamic predictive mechanisms play a decisive role in the dichotomous perception of incongruent audiovisual inputs.
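As a rough intuition for the valid-neighbor argument above, the sketch below implements a deliberately simplified, static toy in Python: recognition units hold a softmax distribution over the tokens /aba/, /ada/ and /aga/, each token predicts a point in the (second formant, lip aperture) feature plane, and the units settle by gradient steps that reduce the squared audiovisual prediction error. The prototype coordinates, the recognize helper, and all parameters are invented for illustration; the paper's actual model additionally encodes stimulus dynamics and asynchronous cross-modal evidence, which this static toy omits.

```python
import numpy as np

# Illustrative prototypes in a normalized (second formant, lip aperture) plane.
# Coordinates are invented for this sketch, not taken from the paper.
PROTOTYPES = {
    "/aba/": np.array([0.0, 0.0]),  # low F2, closed lips
    "/ada/": np.array([0.5, 0.7]),  # mid F2, fairly open lips
    "/aga/": np.array([1.0, 0.8]),  # high F2, open lips
}

def recognize(f2, aperture, steps=5000, lr=0.2):
    """Settle recognition-unit activity against a fixed audiovisual input.

    Each iteration: the softmax readout over tokens generates a top-down
    prediction for both modalities (a mixture of prototypes), the
    per-modality prediction errors are computed, and the unit activations
    move along the gradient that reduces the squared error.
    """
    tokens = list(PROTOTYPES)
    protos = np.stack([PROTOTYPES[t] for t in tokens])  # (3 tokens, 2 features)
    sensory = np.array([f2, aperture])
    logits = np.zeros(len(tokens))  # recognition-unit activations
    for _ in range(steps):
        post = np.exp(logits - logits.max())
        post /= post.sum()
        error = sensory - post @ protos      # prediction error, both modalities
        grad = protos @ error                # how each token would reduce it
        logits += lr * post * (grad - post @ grad)  # softmax-gradient step
    return dict(zip(tokens, np.round(post, 2)))

print(recognize(0.5, 0.7))  # congruent /ada/
print(recognize(0.0, 0.8))  # fusion stimulus: acoustic /aba/ F2, /aga/ lips
print(recognize(1.0, 0.0))  # combination stimulus: acoustic /aga/ F2, /aba/ lips
```

With these toy coordinates the congruent input settles on /ada/; the fusion stimulus is dominated by /ada/, its nearest valid neighbor in the feature plane; and the combination stimulus has no single good match, so the activity splits between /aba/ and /aga/, loosely mirroring the fusion/combination dichotomy described in the abstract.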


          Author and article information

Journal
Cortex (Cortex; a journal devoted to the study of the nervous system and behavior)
Publisher: Elsevier BV
ISSN: 0010-9452 (print); 1973-8102 (electronic)
July 2015, Volume 68
Affiliations
[1] Department of Basic Neurosciences, University of Geneva, Geneva, Switzerland. Electronic address: miren.olasagasti@unige.ch.
[2] Department of Basic Neurosciences, University of Geneva, Geneva, Switzerland.
Article
PII: S0010-9452(15)00132-X
DOI: 10.1016/j.cortex.2015.04.008
PMID: 26009260
ScienceOpen record ID: da77fe22-b142-4a8b-9e86-07af63ac495f

Keywords: Computational modeling, Predictive coding, McGurk effect, Audiovisual integration
