When many people in a room talk at the same time, the sounds of their voices mix before ever arriving at our ears. Sorting this mixture back into individual voices is a profoundly difficult mathematical problem, yet the brain routinely accomplishes it, often with little apparent effort. The neural underpinnings of this ability are not well understood. Furthermore, when the ability declines, e.g., due to hearing loss or aging, it is not known which specific neural processing mechanisms are most critical for preserving what remains of it. To address these issues, the proposed research uses magnetoencephalography (MEG) to record from the auditory cortex of behaving human subjects, capturing the temporally dynamic neural responses to individual sound elements and their mixtures. Linking these neural responses to the auditory stimuli and to the listener's attentional state allows us to infer neural representations of the sounds. These representations are temporal: the neural processing unfolds in time in response to the ongoing acoustic dynamics. This research program will use these temporal representations to investigate how complex auditory scenes are neurally encoded, from the broad mixture of the entire acoustic scene down to separated individual sources, across different areas of auditory cortex, with special emphasis on speech. Its overarching hypothesis is that auditory cortex employs a universal neural encoding scheme, genuinely temporal in nature, which underlies not only general auditory processing but also auditory scene segregation. The first specific aim will determine how auditory cortex neurally represents speech in difficult listening situations. One example is speech in noise in a reverberant environment, a highly relevant combination that can strongly undermine speech intelligibility.
Another example is listening to one speaker in the presence of several competing speakers. In this case, understanding how the background (the mixture of the competing speakers) is neurally represented is of particular interest, and of direct relevance to determining how the brain segregates the foreground speech from the background. The second specific aim will determine analogs of these neural speech representations for dynamic non-speech sounds, especially when the sounds are separate components of a larger acoustic scene. This will generalize what is known about speech segregation to a wider class of sounds (while speech is very important for human listeners, most sounds are not speech). The third specific aim investigates the detailed neural mechanisms by which auditory cortex identifies and isolates individual speakers in a complex acoustic scene. Pitch and timbre, two acoustic cues known to be important for this task, will be modified separately and independently, so that their individual contributions to the neural segregation of speech can be determined.
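The kind of stimulus-response linking described above is commonly formalized as a temporal response function (TRF): a linear kernel that maps the ongoing acoustic dynamics (e.g., the speech envelope) onto the recorded neural response at a range of time lags. The sketch below is only an illustration of that general idea, using simple ridge-regularized regression on synthetic data; it is not the proposal's actual analysis pipeline, and all function names and parameter values are assumptions for demonstration.

```python
import numpy as np

def lagged_matrix(stimulus, n_lags):
    """Build a design matrix whose columns are time-lagged copies
    of the stimulus envelope (lag 0, 1, ..., n_lags-1 samples)."""
    n = len(stimulus)
    X = np.zeros((n, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = stimulus[:n - lag]
    return X

def estimate_trf(stimulus, response, n_lags, ridge=0.1):
    """Estimate a temporal response function by ridge regression:
    solve (X'X + ridge*I) w = X'y for the lag weights w."""
    X = lagged_matrix(stimulus, n_lags)
    return np.linalg.solve(X.T @ X + ridge * np.eye(n_lags), X.T @ response)

# Synthetic check: a known kernel convolved with a random envelope,
# plus a small amount of noise, should be recoverable as the TRF.
rng = np.random.default_rng(0)
envelope = rng.random(5000)                       # stand-in acoustic envelope
true_trf = np.array([0.0, 0.5, 1.0, 0.5, 0.0, -0.3])
response = np.convolve(envelope, true_trf)[:len(envelope)]
response += 0.01 * rng.standard_normal(len(envelope))
trf = estimate_trf(envelope, response, n_lags=6)
```

In practice, MEG analyses of this kind operate on many sensors or source-localized channels at once and use cross-validated regularization or boosting rather than a fixed ridge parameter, but the core computation, regressing the response onto lagged copies of the stimulus, is the same.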
Recent research has shown that a large class of speech and hearing impairments, including those arising from aging, from cochlear implant use, and from many other auditory disorders, are linked specifically to problems with the temporal processing of sounds in the brain. Temporal processing is especially critical for understanding speech, both under general conditions and in noisy environments in particular (e.g., restaurants and other social settings). The proposed research identifies, measures, and quantifies the brain's temporal processing of sounds, especially in noisy environments. By using noninvasive neuroimaging tools, it also provides a clear path for future translational research.