Natural audiovisual speech encoding in the early stages of the human cortical hierarchy

Lalor, Edmund

Abstract

Speech is central to human life. Yet how the human brain processes speech in complex everyday situations remains poorly understood. One prominent idea is that speech perception is carried out using brain areas and mechanisms that are used for processing sounds more generally. It has also been suggested that these mechanisms become specialized for speech through learning, resulting in a speech processing network in the brain that processes increasingly complex aspects of the speech signal at successive hierarchical stages. But questions about the function of this hierarchy remain. In particular, while it is commonly acknowledged that seeing a speaker's face in noisy environments can improve comprehension, how visual speech influences the hierarchical processing of speech remains unclear. This is unfortunate as speech processing, and multisensory speech processing in particular, has been reported to be affected in a number of clinical disorders, including autism and schizophrenia. Thus, as well as contributing to our understanding of this most fundamental of human abilities, better knowledge of the neural mechanisms underpinning audiovisual speech processing could have important clinical research implications. One of the principal reasons for our lack of knowledge on the neurophysiology of audiovisual speech is the technical challenge associated with indexing the neural processing of natural speech with high temporal resolution and at multiple levels of the speech processing hierarchy. Non-human primates represent a less than perfect model for studying human speech processing, the hemodynamic changes underlying functional magnetic resonance imaging are too slow to track natural speech dynamics, and electrocorticography samples only a limited number of brain areas and cannot be broadly applied in clinical research. Recently, our group has introduced several new approaches for indexing natural speech processing using electroencephalography (EEG).
These include entirely novel frameworks for producing dependent measures of the hierarchical encoding of natural speech, and for quantifying multisensory integration of natural audiovisual speech. The present proposal seeks to exploit this opportunity to test the hypothesis that the integration of audio and visual speech is a flexible, multistage process that adapts to optimize comprehension based on the current listening conditions. Across three objectives the proposal aims to characterize this flexibility by determining how the hierarchical processing stage at which visual and audio speech are integrated varies as a function of 1) the listening environment, 2) the visual information available and 3) the deployment of attention. The work promises to bring a new depth of understanding to the perception of one of humanity's most essential signals. It will also introduce several novel analyses and experimental paradigms that should be easily deployable in research on clinical cohorts in which speech processing and/or multisensory integration is impaired.

Public Health Relevance

While it is well known that seeing a speaker's face in a noisy environment can help you to understand what they are saying, how the brain actually combines audio and visual speech information to achieve this is not well understood. A better understanding of the neural mechanisms involved in this process would be of great benefit, as the so-called multisensory integration of audio and visual speech has been reported to be specifically affected in developmental and psychiatric disorders, including autism and schizophrenia. This project seeks to exploit several brand new approaches for studying natural audiovisual speech integration in the healthy human brain so as to gain greater insights into how brains so effortlessly combine speech information from vision and sound, with a view to informing future clinical research in several patient populations.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 1R01DC016297-01A1
Application #: 9518331
Study Section: Mechanisms of Sensory, Perceptual, and Cognitive Processes Study Section (SPC)
Program Officer: King, Kelly Anne

Project Start: 2018-03-15
Project End: 2023-02-28
Budget Start: 2018-03-15
Budget End: 2019-02-28
Support Year: 1
Fiscal Year: 2018

Institution

Name: University of Rochester
Department: Biomedical Engineering
Type: School of Medicine & Dentistry
DUNS #: 041294109

City: Rochester
State: NY
Country: United States
Zip Code: 14627

Related projects


NIH 2021 R01 DC	Natural audiovisual speech encoding in the early stages of the human cortical hierarchy Lalor, Edmund / University of Rochester
NIH 2020 R01 DC	Natural audiovisual speech encoding in the early stages of the human cortical hierarchy Lalor, Edmund / University of Rochester
NIH 2019 R01 DC	Natural audiovisual speech encoding in the early stages of the human cortical hierarchy Lalor, Edmund / University of Rochester
NIH 2018 R01 DC	Natural audiovisual speech encoding in the early stages of the human cortical hierarchy Lalor, Edmund / University of Rochester
