
      The hemispheric lateralization of speech processing depends on what “speech” is: a hierarchical perspective

      editorial
       
      Frontiers in Human Neuroscience
      Frontiers Media S.A.


          Abstract

A recurring question in neuroimaging studies of spoken language is whether speech is processed largely bilaterally, or whether the left hemisphere plays a more dominant role (cf. Hickok and Poeppel, 2007; Rauschecker and Scott, 2009). Although questions regarding underlying mechanisms are certainly of interest, the discussion unfortunately gets sidetracked due to the imprecise use of the word “speech”: by being more explicit about the type of cognitive and linguistic processing to which we are referring, it may be possible to reconcile many of the disagreements present in the literature.

Levels of processing during connected speech comprehension

A relatively uncontroversial starting point is to acknowledge that understanding a spoken sentence requires a listener to analyze a complex acoustic signal along a number of levels, listed schematically in Figure 1. Phonemes must be distinguished, words identified, and grammatical structure taken into account so that meaning can be extracted. These processes operate in an interactive, parallel fashion, and as such are difficult to fully disentangle. Such interdependence also means that as researchers we often use “speech” as a term of convenience to mean:

• Amplitude-modulated noise or spectral transitions, as might be similar to aspects of spoken language;
• Phonemes (“b”), syllables (“ba”), or pseudowords (“bab”);
• Words (“bag”);
• Phrases (“the bag”);
• Sentences (“The bag of carrots fell to the floor”) or narratives.

Figure 1. The cortical regions involved in processing spoken language depend in a graded fashion on the level of acoustic and linguistic processing required. Processing related to amplitude-modulated noise is bilateral (e.g., Giraud et al., 2000), shown at top. However, as the requirements for linguistic analysis and integration increase, neural processing shows a concomitant increase in its reliance on left hemisphere regions for words [see meta-analysis in Davis and Gaskell (2009)] and sentences [see meta-analysis in Adank (2012)].

Naturally, because different types of spoken language require different cognitive mechanisms (spanning sublexical, lexical, and supralexical units), using an unqualified term such as “speech” can lead to confusion about the processes being discussed. Although this point might seem obvious, a quick review of the speech literature demonstrates that many authors have at one time or another assumed their definition of “speech” was obvious enough that they need not give it, leaving readers to form their own opinions.

Below I will briefly review literature in relation to the neural bases for two types of spoken language processing: unconnected speech (isolated phonemes and single words) and connected speech (sentences or narratives). The goal is to illustrate that, within the context of a hierarchical neuroanatomical framework, there are aspects of “speech” processing that are both bilateral and lateralized.
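As a purely illustrative aid (not part of the original article), the graded hierarchy sketched in Figure 1 can be written down as a small lookup table pairing each level of “speech” with an example stimulus and a coarse paraphrase of the lateralization described in the text; the example stimuli and the wording of the labels are simplifications rather than claims from the article.

# Toy summary of the hierarchy in Figure 1 (illustrative only, not from the
# article): each level of "speech" is paired with an example stimulus and a
# coarse paraphrase of the lateralization described in the text.
SPEECH_HIERARCHY = [
    ("amplitude-modulated noise", "modulated noise burst", "bilateral auditory cortex"),
    ("phonemes / syllables / pseudowords", "ba", "largely bilateral temporal cortex"),
    ("words", "bag", "bilateral temporal cortex, some left-lateralized regions"),
    ("phrases", "the bag", "increasingly left-lateralized"),
    ("sentences / narratives", "The bag of carrots fell to the floor",
     "left-lateralized frontotemporal network"),
]

def lateralization_for(level: str) -> str:
    """Return the coarse lateralization label for a named level of 'speech'."""
    for name, example, lateralization in SPEECH_HIERARCHY:
        if name == level:
            return f"{level}: {lateralization} (example stimulus: {example!r})"
    raise KeyError(f"unknown level: {level!r}")

if __name__ == "__main__":
    print(lateralization_for("words"))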
Unconnected speech is processed largely bilaterally in temporal cortex

The first cortical way station for acoustic input to the brain is primary auditory cortex: not surprisingly, acoustic stimuli activate this region robustly in both hemispheres, whether they consist of pure tones (Belin et al., 1999; Binder et al., 2000) or amplitude-modulated noise (Giraud et al., 2000; Hart et al., 2003; Overath et al., 2012). Although there is speculation regarding hemispheric differences in specialization for these low-level signals (Poeppel, 2003; Giraud et al., 2007; Obleser et al., 2008; McGettigan and Scott, 2012), for the current discussion it is sufficient to note that both left and right auditory cortices respond robustly to most auditory stimuli, and that proposed differences in hemispheric preference relate to a modulation of this overall effect.

Beyond low-level acoustic stimulation, phonemic processing requires both an appropriate amount of spectral detail and the relationship to a pre-existing acoustic category (i.e., the phoneme). The processing of isolated syllables results in activity along the superior temporal sulcus and middle temporal gyrus, typically on the left but not the right (Liebenthal et al., 2005; Heinrich et al., 2008; Agnew et al., 2011; DeWitt and Rauschecker, 2012). Although this may suggest a left hemisphere specialization for phonemes, listening to words (which, of course, include phonemes) reliably shows strong activity in bilateral middle and superior temporal gyrus (Price et al., 1992; Binder et al., 2000, 2008). In addition, stroke patients with damage to left temporal cortex are generally able to perform reasonably well on word-to-picture matching tasks (Gainotti et al., 1982); the same is true of healthy controls undergoing a Wada procedure (Hickok et al., 2008). Together these findings suggest that the right hemisphere is able to support at least some degree of phonemic and lexical processing.

That being said, there are also regions that show increased activity for words in the left hemisphere but not the right, particularly when pseudowords are used as a baseline (Davis and Gaskell, 2009). Both pseudowords and real words rely on stored representations of speech sounds (they share phonemes), but real words also involve consolidated lexical and/or conceptual information (Gagnepain et al., 2012). Left-hemisphere activations likely reflect the contribution of lexical and semantic memory processes that are accessed in an obligatory manner during spoken word recognition. Within the framework outlined in Figure 1, spoken words thus lie between very low-level auditory processing (which is essentially bilateral) and the processing of sentences and narratives (which, as I will discuss below, is more strongly left lateralized). Processing of phonemes and single words therefore appears to be mediated in large part by both left and right temporal cortex, although some indications of lateralization may be apparent.

Connected speech relies on a left-lateralized frontotemporal network

In addition to recognizing single words, comprehending connected speech, such as meaningful sentences, depends on integrative processes that help determine the syntactic and semantic relationship between words. These processes rely not only on phonemic and lexical information, but also on prosodic and rhythmic cues conveyed over the course of several seconds. In other words, a sentence is not simply a string of phoneme-containing items, but conveys a larger meaning through its organization (Vandenberghe et al., 2002; Humphries et al., 2006; Lerner et al., 2011; Peelle and Davis, 2012). In addition to providing content in and of itself, the syntactic, semantic, and rhythmic structure present in connected speech also supports listeners' predictions of upcoming acoustic information.
An early and influential PET study of connected speech by Scott et al. (2000) showed increased activity in the lateral aspect of left anterior temporal cortex for spoken sentences relative to unintelligible spectrally rotated versions of these sentences. Subsequent studies, due in part to the use of a greater number of participants, have typically found intelligibility effects bilaterally, often along much of the length of superior temporal cortex (Crinion et al., 2003; Friederici et al., 2010; Wild et al., 2012a). In addition, a large and growing number of neuroimaging experiments show left inferior frontal involvement for intelligible sentences, either compared to an unintelligible control condition (Rodd et al., 2005, 2010; Awad et al., 2007; Obleser et al., 2007; Okada et al., 2010; Peelle et al., 2010a; McGettigan et al., 2012; Wild et al., 2012b) or parametrically correlating with intelligibility level (Davis and Johnsrude, 2003; Obleser and Kotz, 2010; Davis et al., 2011). Regions of left inferior frontal cortex are also involved in processing syntactically complex speech (Peelle et al., 2010b; Tyler et al., 2010; Obleser et al., 2011) and in resolving semantic ambiguity (Rodd et al., 2005, 2010, 2012; Snijders et al., 2010). In most of these studies, activity in right inferior frontal cortex is not significant, or is noticeably smaller in extent than activity in the left hemisphere. These functional imaging studies are consistent with patient work demonstrating that participants with damage to left inferior frontal cortex have difficulty with sentence processing (e.g., Grossman et al., 2005; Peelle et al., 2007; Papoutsi et al., 2011; Tyler et al., 2011).

Processing connected speech thus relies more heavily on left hemisphere language regions, most obviously in inferior frontal cortex. The evidence outlined above suggests this is largely due to the increased linguistic demands associated with sentence processing compared to single words.

The importance of statistical comparisons for inferences regarding laterality

In many of the above papers (and in my interpretation of them), laterality was not statistically assessed, but inferred based on the presence or absence of an activation cluster in a particular brain region: that is, seeing a cluster of activation in left inferior frontal gyrus but not the right, and concluding that this particular task has a “left lateralized” pattern of neural activity. However, simply observing a response in one region, but not another, does not mean that these regions significantly differ in their activity (the “imager's fallacy”; Henson, 2005). This is a well-known statistical principle, but one that can remain difficult to follow in the face of compelling graphical depictions of data (Nieuwenhuis et al., 2011). Nevertheless, for true claims of differential hemispheric contributions to speech processing, the left and right hemisphere responses need to be directly compared.

Unfortunately, for functional imaging studies hemispheric comparisons are not as straightforward as they seem, in part because our left and right hemispheres are not mirror images of each other. There are, however, a number of reasonable ways to approach this challenge, including:

• Extracting data from regions of interest (ROIs), including independently defined functional regions (Kriegeskorte et al., 2009) or probabilistic cytoarchitecture (Eickhoff et al., 2005), and averaging over voxels to compare left and right hemisphere responses. Sometimes these ROIs end up being large, which does not always support the specific hypotheses being tested, and not all regions may be available. However, this approach is relatively straightforward to implement and interpret.
• Using a custom symmetric brain template for spatial normalization (Bozic et al., 2010). This may result in less veridical spatial registration, but enables voxel-by-voxel statistical tests of laterality by flipping images around the Y axis, avoiding the problem of ROI selection (and averaging); a minimal sketch of this approach appears below.
• Comparing left vs. right hemisphere responses using a multivariate classification approach (McGettigan et al., 2012). Multivariate approaches are robust to large ROIs, as their performance is typically driven by a smaller (more informative) subset of all voxels studied. Multivariate approaches may be somewhat more challenging to implement, however, and (depending on the size of the ROI used) may limit spatial specificity.

In the absence of these or similar statistical comparisons, any statements about lateralization of processing need to be made (and taken) lightly.
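To make the symmetric-template option concrete, here is a minimal Python sketch of a voxel-by-voxel laterality test, assuming each subject's contrast image has already been normalized to a left-right symmetric template. The file names, subject count, and the assumption that the first array axis runs left-right are hypothetical; this illustrates the general logic rather than the exact procedure used in any of the studies cited above.

# Minimal sketch of a voxel-wise laterality test (illustrative, not from the
# article). Assumes contrast images are already in a left-right symmetric
# template space; paths and axis conventions are hypothetical.
import numpy as np
import nibabel as nib
from scipy import stats

# Hypothetical per-subject contrast images (e.g., sentences > baseline).
subject_files = [f"sub-{i:02d}_speech_contrast.nii.gz" for i in range(1, 21)]

originals, mirrored = [], []
for path in subject_files:
    img = nib.load(path)
    data = img.get_fdata()
    originals.append(data)
    # Flip across the left-right axis; here we assume the first array axis
    # runs left-right (check the image orientation for real data).
    mirrored.append(np.flip(data, axis=0))

originals = np.stack(originals)   # shape: (n_subjects, x, y, z)
mirrored = np.stack(mirrored)

# Paired t-test at every voxel: does each voxel's response differ from its
# mirror-image counterpart across subjects?
result = stats.ttest_rel(originals, mirrored, axis=0)
t_map = np.asarray(result.statistic)   # p-values available as result.pvalue

# Save the laterality t-map for thresholding/correction downstream.
nib.save(nib.Nifti1Image(t_map.astype(np.float32), img.affine),
         "laterality_tmap.nii.gz")

In practice the resulting map would be thresholded with the usual correction for multiple comparisons, and the comparison is only meaningful in a template space where homologous voxels are approximately aligned across hemispheres.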
Conclusions

I have not intended to make any novel claims about the neural organization of speech processing, merely to clarify what has already been shown: phonological and lexical information is processed largely bilaterally in temporal cortex, whereas connected speech relies on a left-hemisphere pathway that includes left inferior frontal gyrus. Importantly, the distinction between unconnected and connected speech is not dichotomous, but follows a gradient of laterality depending on the cognitive processes required: lateralization emerges largely as a result of increased linguistic processing. So, is speech processed primarily bilaterally, or along a left-dominant pathway? It depends on what sort of “speech” we are talking about, and being more specific in our characterizations will do much to advance the discussion. Of more interest will be future studies that continue to identify the constellation of cognitive processes supported by these neuroanatomical networks.

          Related collections

Most cited references: 37


          Topographic mapping of a hierarchy of temporal receptive windows using a narrated story.

          Real-life activities, such as watching a movie or engaging in conversation, unfold over many minutes. In the course of such activities, the brain has to integrate information over multiple time scales. We recently proposed that the brain uses similar strategies for integrating information across space and over time. Drawing a parallel with spatial receptive fields, we defined the temporal receptive window (TRW) of a cortical microcircuit as the length of time before a response during which sensory information may affect that response. Our previous findings in the visual system are consistent with the hypothesis that TRWs become larger when moving from low-level sensory to high-level perceptual and cognitive areas. In this study, we mapped TRWs in auditory and language areas by measuring fMRI activity in subjects listening to a real-life story scrambled at the time scales of words, sentences, and paragraphs. Our results revealed a hierarchical topography of TRWs. In early auditory cortices (A1+), brain responses were driven mainly by the momentary incoming input and were similarly reliable across all scrambling conditions. In areas with an intermediate TRW, coherent information at the sentence time scale or longer was necessary to evoke reliable responses. At the apex of the TRW hierarchy, we found parietal and frontal areas that responded reliably only when intact paragraphs were heard in a meaningful sequence. These results suggest that the time scale of processing is a functional property that may provide a general organizing principle for the human cerebral cortex.

            A complementary systems account of word learning: neural and behavioural evidence

            In this paper we present a novel theory of the cognitive and neural processes by which adults learn new spoken words. This proposal builds on neurocomputational accounts of lexical processing and spoken word recognition and complementary learning systems (CLS) models of memory. We review evidence from behavioural studies of word learning that, consistent with the CLS account, show two stages of lexical acquisition: rapid initial familiarization followed by slow lexical consolidation. These stages map broadly onto two systems involved in different aspects of word learning: (i) rapid, initial acquisition supported by medial temporal and hippocampal learning, (ii) slower neocortical learning achieved by offline consolidation of previously acquired information. We review behavioural and neuroscientific evidence consistent with this account, including a meta-analysis of PET and functional Magnetic Resonance Imaging (fMRI) studies that contrast responses to spoken words and pseudowords. From this meta-analysis we derive predictions for the location and direction of cortical response changes following familiarization with pseudowords. This allows us to assess evidence for learning-induced changes that convert pseudoword responses into real word responses. Results provide unique support for the CLS account since hippocampal responses change during initial learning, whereas cortical responses to pseudowords only become word-like if overnight consolidation follows initial learning.

              Functional integration across brain regions improves speech perception under adverse listening conditions.

              Speech perception is supported by both acoustic signal decomposition and semantic context. This study, using event-related functional magnetic resonance imaging, investigated the neural basis of this interaction with two speech manipulations, one acoustic (spectral degradation) and the other cognitive (semantic predictability). High compared with low predictability resulted in the greatest improvement in comprehension at an intermediate level of degradation, and this was associated with increased activity in the left angular gyrus, the medial and left lateral prefrontal cortices, and the posterior cingulate gyrus. Functional connectivity between these regions was also increased, particularly with respect to the left angular gyrus. In contrast, activity in both superior temporal sulci and the left inferior frontal gyrus correlated with the amount of spectral detail in the speech signal, regardless of predictability. These results demonstrate that increasing functional connectivity between high-order cortical areas, remote from the auditory cortex, facilitates speech comprehension when the clarity of speech is reduced.

                Author and article information

                Journal
                Front Hum Neurosci
                Front. Hum. Neurosci.
                Frontiers in Human Neuroscience
                Frontiers Media S.A.
                1662-5161
                16 November 2012
                2012
Volume: 6
Article number: 309
                Affiliations
Department of Otolaryngology, Washington University in St. Louis, St. Louis, MO, USA
                Author notes

                Edited by: Russell A. Poldrack, University of Texas, USA

                Reviewed by: Russell A. Poldrack, University of Texas, USA

                Article
DOI: 10.3389/fnhum.2012.00309
PMCID: PMC3499798
PMID: 23162455
                15039087-9959-4709-8ed2-15e2d60b95ba
                Copyright © 2012 Peelle.

                This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in other forums, provided the original authors and source are credited and subject to any copyright notices concerning any third-party graphics etc.

                History
Received: 09 October 2012
Accepted: 25 October 2012
                Page count
                Figures: 1, Tables: 0, Equations: 0, References: 55, Pages: 4, Words: 3540
                Categories
                Neuroscience
                Opinion Article

                Neurosciences
