The purpose of the proposed research is to develop capabilities for studying a novel hypothesis concerning human speech processing and to obtain pilot results in support of the hypothesis. The hypothesis challenges the dominant view in audiovisual (AV) speech perception research, which is that auditory and visual speech information converges easily into a common format during bottom-up sensory-perceptual processing. The hypothesis to be examined is that AV speech perception is a relatively late event for the speech perceiving brain, following unimodal processing and involving cortical networks of associations between auditory and visual speech representations. To investigate this hypothesis, results are needed on the cortical locations and processing latencies of responses to auditory and visual speech stimuli. The approach will use data with high temporal resolution (event-related potentials, ERPs, obtained with electrophysiology) and high spatial resolution (functional magnetic resonance imaging, fMRI), obtained from perceivers presented with auditory and visual speech stimuli. Participants will be presented with nonsense syllables under visual-only, auditory-only, and AV conditions. Matched and mismatched AV stimuli will be tested, and the mismatched stimuli will be quantitatively graded in terms of their auditory-to-visual correspondence. Under the driving hypothesis for the research, cortical responses are predicted to vary as a function of correspondence; therefore, differential responses (their locations, temporal dynamics, and strengths) will be interpreted as the neural correlates of the binding (combining) process for auditory and visual speech. A main focus of the project will be to model the ERP data in terms of dipole sources and current source densities, with each type of analysis taking into account results from functional and anatomical MRI.
This project will lead to a research program that will investigate in more detail the issue of whether convergence or association mechanisms can account for AV speech processing. This is a fundamental issue within the larger question of how the brain processes multisensory information. Accurate knowledge about the speech perceiving brain is needed for all clinical applications involving speech communication.
Jiang, Jintao; Bernstein, Lynne E (2011) Psychophysics of the McGurk and other audiovisual speech integration effects. J Exp Psychol Hum Percept Perform 37:1193-209
Ponton, Curtis W; Bernstein, Lynne E; Auer Jr, Edward T (2009) Mismatch negativity with visual-only and audiovisual speech. Brain Topogr 21:207-15
Bernstein, Lynne E; Auer Jr, Edward T; Wagner, Michael et al. (2008) Spatiotemporal dynamics of audiovisual speech processing. Neuroimage 39:423-35
Bernstein, Lynne E; Lu, Zhong-Lin; Jiang, Jintao (2008) Quantified acoustic-optical speech signal incongruity identifies cortical sites of audiovisual speech processing. Brain Res 1242:172-84
Jiang, Jintao; Auer Jr, Edward T; Alwan, Abeer et al. (2007) Similarity structure in visual speech perception and optical phonetic signals. Percept Psychophys 69:1070-83