Introduction

A large number of invasive and non-invasive neurophysiological studies provide converging evidence that cortical oscillations play an important role in gating information flow in the human brain, thereby supporting a variety of cognitive processes including attention, working memory, and decision-making. These oscillations can be hierarchically organised. For example, the phase of theta oscillations (4–8 Hz) can modulate the amplitude of gamma oscillations (30–90 Hz), and the phase of delta oscillations (1–2 Hz) can modulate the amplitude of theta oscillations. Interestingly, speech comprises a remarkably similar hierarchy of rhythmic components representing prosody (delta band), syllables (theta band), and phonemes (gamma band). The similarity in the hierarchical organisation of cortical oscillations and the rhythmic components of speech suggests that cortical oscillations at different frequencies might sample auditory speech input at different rates. Cortical oscillations could therefore represent an ideal medium for multiplexed segmentation and coding of speech. The hierarchical coupling of oscillations (with fast oscillations nested in slow oscillations) could be used to multiplex complementary information over multiple time scales, for example by separately encoding fast (e.g., phonemic) and slower (e.g., syllabic) information and their temporal relationships. Previous studies have demonstrated amplitude and phase modulation in response to speech stimuli in the delta, theta, and gamma bands using electroencephalography (EEG)/magnetoencephalography (MEG) and electrocorticography (ECoG). These findings support an emerging view that speech stimuli induce low-frequency phase patterns in auditory areas that code input information. Interestingly, these phase patterns seem to be under attentional control. For example, in the well-known cocktail party situation, they code mainly for the attended stimulus.
Thus, brain oscillations have become obvious candidates for segmenting and parsing continuous speech because they reflect rhythmic changes in excitability. This attractive model leaves three important points largely unresolved. First, a comprehensive account of how rhythmic components in speech interact with brain oscillations is still missing, and it is uncertain whether the previously reported hemispheric asymmetry during speech perception is also evident in a lateralised alignment of brain oscillations to continuous speech. Behavioural, electrophysiological, and neuroimaging studies suggest that there is a relatively long integration window (100–300 ms, corresponding to the theta band) in the right auditory cortex and a relatively short integration window (20–40 ms, corresponding to the gamma band) in the left auditory cortex. However, it is unclear whether this differentiation is relevant for oscillatory tracking of speech. Second, it is unknown whether cortical brain oscillations are hierarchically coupled during perception of continuous speech. This is of particular interest because hierarchically coupled brain oscillations could represent hierarchically organised speech components (prosody, syllables, phonemes) at different temporal scales. Third, it is unclear how oscillatory speech tracking dynamically adapts to arrhythmic components in speech. If brain oscillations implement a universal mechanism for speech processing, they should also account for variations or breaks in speech rhythmicity, so that the phase of low-frequency oscillations aligns to (quasi-periodic) salient speech events for optimal processing. Here, we addressed these three points using continuous speech and analysis based on information theory. Importantly, all three points were investigated for intelligible and unintelligible (backward played) speech. We analysed the frequency-specific dependencies between the speech envelope and brain activity.
We also analysed the dependencies between cortical oscillations across different frequencies. We first hypothesised that a multi-scale hierarchy of oscillations in the listener's brain tracks the dynamics of the speaker's speech envelope, specifically with preferential theta band tracking in the right auditory cortex and gamma band tracking in the left auditory cortex. Second, we asked whether speech-entrained brain oscillations are hierarchically coupled and, if so, how that coupling is modulated by the stimulus. Third, we asked whether the phase of low-frequency brain oscillations (likely indicating rhythmic variations in neural excitability) in the auditory cortex coincides with and adapts to salient events in speech stimuli. We presented a 7-min long continuous story binaurally to 22 participants while recording neural activity with MEG (“story” condition). As a control condition the same story was played backwards (“back” condition). We used mutual information (MI) to measure all dependencies (linear and nonlinear) between the speech signal and its encoding in brain oscillations. We did so in all brain voxels for frequencies from 1 to 60 Hz and for important interactions (phase-phase, amplitude-amplitude, cross-frequency phase-amplitude, and cross-frequency amplitude-phase; see Figure 1 and Materials and Methods). This resulted in frequency-specific functional brain maps of dependencies between the speech envelope and brain activity. A similar analysis was performed to study dependencies between brain oscillations within cortical areas but across different frequency bands.

Figure 1. Mutual information analysis (doi: 10.1371/journal.pbio.1001752.g001). The broadband amplitude envelope is computed for the speech signal. For each frequency band, speech envelope and MEG signals are bandpass filtered and activation time series are computed for each voxel in the brain.
Phase and amplitude time series are computed from the Hilbert transform for speech and voxel time series and subjected to MI analysis. MI is computed between the speech signal and the time series for each voxel, leading to a tomographic map of MI. Group statistical analysis is performed on these maps across all 22 participants.

Our results reveal hierarchically coupled oscillations in speech-related brain areas and their alignment to quasi-rhythmic components in continuous speech (prosody, syllables, phonemes), with pronounced asymmetries between left and right hemispheres. Edges in the speech envelope reset oscillatory low-frequency phase in left and right auditory cortices. Phase resets in cortical oscillations code features of the speech edges and help to align temporal windows of high neural excitability to optimise processing of important speech events. Importantly, we demonstrate that oscillatory speech tracking and hierarchical couplings are significantly reduced for backward-presented speech and so are not only stimulus driven.

Results

Oscillatory Speech Tracking Relies on Two Mechanisms

We first asked whether there is phase-locking between rhythmic changes in the speech envelope and corresponding oscillatory brain activity. Whereas most previous studies quantify phase-locking to stimulus onset across repeated presentations of the same stimulus, here we studied phase-locking over time directly between speech envelope and brain oscillations. To do this, we compared the phase coupling between the speech envelope and oscillatory brain activity (in 1 Hz steps between 1 and 60 Hz) in two conditions: story and back. Figure 2 summarizes the results. First, MI revealed a significantly stronger phase coupling between the speech envelope and brain oscillations in the story condition than in the back condition in the left and right auditory cortex in the delta (1–3 Hz) and theta (3–7 Hz) frequency bands (group statistics, p<0.05).
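The phase-phase dependency described above can be illustrated with a minimal sketch. It assumes a simple plug-in MI estimate over equal-width phase bins; the estimator, the number of bins, the filter order, the function names `band_phase` and `binned_mi`, and the synthetic signals are illustrative assumptions, not the procedure specified in this section.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt


def band_phase(x, fs, lo, hi, order=3):
    """Zero-phase band-pass filter x and return its instantaneous Hilbert phase."""
    sos = butter(order, [lo, hi], btype="bandpass", fs=fs, output="sos")
    return np.angle(hilbert(sosfiltfilt(sos, x)))


def binned_mi(a, b, n_bins=8):
    """Plug-in mutual information (bits) between two phase series.

    Phases are quantised into n_bins equal-width bins on [-pi, pi]
    (the bin count is an assumption made for this sketch).
    """
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    joint, _, _ = np.histogram2d(a, b, bins=[edges, edges])
    p_ab = joint / joint.sum()
    p_a = p_ab.sum(axis=1, keepdims=True)   # marginal of a, shape (n_bins, 1)
    p_b = p_ab.sum(axis=0, keepdims=True)   # marginal of b, shape (1, n_bins)
    nz = p_ab > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(p_ab[nz] * np.log2(p_ab[nz] / (p_a @ p_b)[nz])))


# Synthetic stand-ins for a speech envelope and two voxel time series (not real data).
fs = 200.0
t = np.arange(0, 60, 1 / fs)
rng = np.random.default_rng(0)
inst_freq = 5.0 + 0.5 * np.sin(2 * np.pi * 0.1 * t)      # quasi-rhythmic theta rate
envelope = np.cos(2 * np.pi * np.cumsum(inst_freq) / fs)
voxel = np.roll(envelope, 20) + 0.5 * rng.standard_normal(t.size)  # 100 ms lag plus noise
control = rng.standard_normal(t.size)                    # unrelated activity

ph_speech = band_phase(envelope, fs, 3, 7)               # theta band, 3-7 Hz
mi_coupled = binned_mi(ph_speech, band_phase(voxel, fs, 3, 7))
mi_control = binned_mi(ph_speech, band_phase(control, fs, 3, 7))
print(f"theta phase-phase MI: coupled={mi_coupled:.3f} bits, control={mi_control:.3f} bits")
```

Because MI is computed on the joint distribution of the two phases, a constant lag between speech and brain signal does not reduce the estimate, which is one reason a dependency measure of this kind suits tracking of continuous stimuli. Plug-in estimates are biased upwards for finite data, so comparisons against a control or surrogate condition, as in the text, remain essential.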
Time-locked to these onsets, we extracted trials from −500 ms to 1,000 ms.

PLV analysis

PLVs were computed in three ways. First, as phase-locking of auditory theta activity across trials, $\mathrm{PLV} = \frac{1}{n}\left|\sum_{k=1}^{n} \exp(i\,\mathrm{ph}_k)\right|$, where $n$ is the number of trials and $\mathrm{ph}_k$ the phase of the auditory theta signal in trial $k$. Second, as phase-locking of the phase difference between the auditory theta signal and the theta speech envelope, $\mathrm{PLV}_{sp} = \frac{1}{n}\left|\sum_{k=1}^{n} \exp\big(i\,(\mathrm{ph}_k - \mathrm{phs}_k)\big)\right|$, where $\mathrm{phs}_k$ is the theta phase of the speech envelope. Third, as phase-locking between left and right auditory theta activity, $\mathrm{PLV}_{lr} = \frac{1}{n}\left|\sum_{k=1}^{n} \exp\big(i\,(\mathrm{phl}_k - \mathrm{phr}_k)\big)\right|$, where $\mathrm{phl}_k$ and $\mathrm{phr}_k$ are the phases of the left and right auditory theta signal, respectively. Time-resolved PLV data were averaged in three time windows (−200 ms to 0 ms, 100–300 ms, 400–600 ms) and subjected to ANOVA with factors time window and PLV measure. Both factors and their interaction were highly significant (time window: F = 39.77, p<0.001; PLV measure: F = 50.11, p<0.001; interaction: F = 14.86, p<0.001).

Speech sampling

For each voxel, the instantaneous amplitude A and phase ph were computed for each speech trial (Figure 6). For each trial, the cross-correlation of either cos(ph) or A with the speech envelope was computed over the time range 0–500 ms following onset with a maximum lag of 150 ms. The maximum correlation across lags was averaged across trials. As a control, the same computation was repeated with a random shuffling of trial order for the speech data (to destroy the correspondence between trials for speech and brain data).

Cross-frequency analysis

We performed two separate analyses to investigate the spatio-spectral distribution of cross-frequency coupling (Figure 7). First, we computed cross-frequency coupling between theta phase and 40 Hz gamma amplitude in all brain voxels.
Second, we computed the full cross-frequency coupling matrix separately for the left and right auditory cortex. The first analysis was motivated by Figure 2C, which demonstrates coupling between speech theta phase and auditory 40 Hz amplitude dynamics, and by Figure 4, which shows theta phase to gamma amplitude coupling in the auditory cortex. Analysis of cross-frequency coupling was performed by computing MI as in Figure 2C (but without using the speech signal). For each brain voxel, MI between theta phase and gamma amplitude was computed for the two 500 ms windows preceding and following speech onset across all 254 trials. t-values for the contrast post-onset versus pre-onset were computed across trials. The computation was performed for the story and back conditions. As in Figure 2, individual maps were subjected to a dependent-samples t-test with randomisation-based FDR correction. Group t-maps are displayed with thresholds corresponding to p<0.05 (FDR corrected). The second analysis was performed only in the left and right auditory cortex. Here, we computed MI as before but now for all combinations of phase (range 1–10 Hz) and amplitude (range 4–80 Hz). The group t-statistic was computed for the difference between the story condition and surrogate data (surrogate data were the same as the story condition, but each amplitude signal was matched with the phase signal from a random trial). For each frequency-frequency pair we computed a bootstrap confidence level by randomly drawing 22 participants with replacement in each of 500 bootstrap iterations and computing the 95th percentile. The lateralisation analysis in Figure 7C follows the same approach as for Figure 7B and compares cross-frequency coupling for the story condition between the left and right auditory cortex.

Supporting Information

Figure S1 (A) Mutual information group statistics for surrogate data. Group statistical map of phase-phase MI dependencies in the theta frequency band.
This figure corresponds to Figure 2B, but here the back condition has been replaced with a surrogate condition consisting of the MEG data from the story condition and the reversed speech envelope from the story condition, to estimate dependencies that could be expected by chance. (B) Phase-locking group statistics. This figure corresponds to Figure 2, but PLV has been used instead of MI to quantify the dependence between the phase of the low-frequency speech envelope and brain activity in the delta band. (C) Same as (B) but for the theta frequency band.

Figure S2 Bar plot of individual lateralisation indices. For each participant the lateralisation index for theta-phase lateralisation (red) and theta-gamma lateralisation (blue) in Heschl's gyrus (left panel) and superior temporal gyrus (STG, right panel) is shown. Each pair of red/blue bars corresponds to an individual.

Figure S3 Bar plot of mutual information in the auditory cortex. For each panel, mean and SEM are shown for the left and right auditory cortex for all conditions. An asterisk indicates relevant significant differences (t-test with p<0.05). The control condition is computed from surrogate data where brain activity from the story condition is used together with the speech envelope from the back condition. (A) Bar plot for delta phase. (B) Bar plot for theta phase. (C) Bar plot for mutual information between theta phase in speech and gamma amplitude in the auditory cortex. (D) Bar plot for mutual information between theta phase and gamma amplitude in the auditory cortex. Here, the control condition was obtained from mutual information with the gamma time series reversed.

Figure S4 Group statistics of cross-frequency coupling. (A) Statistical map of the difference between story and back condition for mutual information between delta phase and theta amplitude.
(B) Statistical map of lateralisation of mutual information between delta phase and theta amplitude for the story condition. (C) Statistical map of the difference between story and back condition for mutual information between theta phase and gamma amplitude. This map corresponds to Figure 4A but is computed using a different method for quantifying cross-frequency coupling.

Figure S5 Phase coding of speech amplitude. The phase of theta oscillations at 100 ms after speech onset in the left (black) and right (red) auditory cortex codes the maximum amplitude of the speech envelope in the first 200 ms following onset. The shaded area signifies the 95% confidence interval around the median obtained from bootstrap analysis.
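The bootstrap confidence interval around the median mentioned for Figure S5 can be sketched as follows. This is a hedged illustration only: the percentile method, the function name `bootstrap_median_ci`, and the synthetic per-participant values are assumptions, and the 500 resamples of 22 participants are borrowed from the cross-frequency analysis rather than stated for this figure.

```python
import numpy as np


def bootstrap_median_ci(values, n_boot=500, level=0.95, seed=0):
    """Percentile-bootstrap confidence interval for the median.

    Resamples the observations with replacement n_boot times and returns
    (sample median, lower bound, upper bound).
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    # Each row holds the indices of one bootstrap resample.
    idx = rng.integers(0, len(values), size=(n_boot, len(values)))
    medians = np.median(values[idx], axis=1)
    alpha = (1.0 - level) / 2.0
    lower, upper = np.quantile(medians, [alpha, 1.0 - alpha])
    return float(np.median(values)), float(lower), float(upper)


# Hypothetical values for 22 participants (synthetic, for illustration only).
data = np.random.default_rng(1).normal(loc=0.0, scale=1.0, size=22)
med, ci_lo, ci_hi = bootstrap_median_ci(data)
print(f"median={med:.3f}, 95% CI=[{ci_lo:.3f}, {ci_hi:.3f}]")
```

Applied separately at each phase bin or time point, an interval of this kind would yield a band around the median curve like the one described in the caption.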