Perceiving and following an individual speaker in a crowded, noisy environment is a commonplace task for listeners with normal hearing. The underlying neurophysiology, however, is complex, and the task remains a struggle for people with disorders of the peripheral and central auditory pathways. The lack of a detailed neurobiological model of the mechanisms and functions underlying robust speech perception has hindered our understanding of how these processes become impaired in affected populations. In our innovative approach, we will record from high-density micro- and macro-electrode arrays surgically implanted on the superior temporal gyrus (STG) of epilepsy patients as part of their clinical evaluation. This method offers an exceptionally detailed view of cortical population activity. We will build on two recent complementary findings: we identified a highly selective, spatially distributed neural representation of phonetic features (Mesgarani et al., Science, 2014) that is at the same time highly dynamic and can change rapidly to reflect the perceptual bias of the listener (Mesgarani & Chang, Nature, 2012). While significant, these studies revealed several gaps in our understanding of this process, which we intend to address in this proposal. Specifically, we will resolve the following unanswered questions: 1) What is the neural mechanism for joint encoding of phonetic and speaker features? 2) How does attention modulate the phonetic and speaker feature selectivity of neural responses? 3) What computational mechanisms can account for the dynamic feature selectivity of responses in the STG? Answering these questions will significantly advance our understanding of a remarkable human ability, and will be of great interest to researchers in many areas, including neurology and sensory and cognitive neuroscience.

Public Health Relevance

Understanding the mechanisms underlying speech perception in challenging environments is a crucial step in determining how these processes deteriorate in various disorders of the peripheral and central auditory pathways. Our studies will produce novel neurobiological models of robust speech perception, a necessary step toward designing innovative therapeutic measures.

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC014279-02
Application #: 9024503
Study Section: Mechanisms of Sensory, Perceptual, and Cognitive Processes Study Section (SPC)
Program Officer: Shekim, Lana O
Project Start: 2015-03-01
Project End: 2020-02-29
Budget Start: 2016-03-01
Budget End: 2017-02-28
Support Year: 2
Fiscal Year: 2016
Total Cost:
Indirect Cost:
Name: Columbia University (N.Y.)
Department: Engineering (All Types)
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #: 049179401
City: New York
State: NY
Country: United States
Zip Code: 10027
Khalighinejad, Bahar; Nagamine, Tasha; Mehta, Ashesh et al. (2017) NAPLib: An open source toolbox for real-time and offline neural acoustic processing. Proc IEEE Int Conf Acoust Speech Signal Process 2017:846-850
O'Sullivan, James; Chen, Zhuo; Herrero, Jose et al. (2017) Neural decoding of attentional selection in multi-speaker environments without access to clean sources. J Neural Eng 14:056001
Khalighinejad, Bahar; Cruzatto da Silva, Guilherme; Mesgarani, Nima (2017) Dynamic Encoding of Acoustic Features in Neural Responses to Continuous Speech. J Neurosci 37:2176-2185
Chen, Zhuo; Luo, Yi; Mesgarani, Nima (2017) Deep attractor network for single-microphone speaker separation. Proc IEEE Int Conf Acoust Speech Signal Process 2017:246-250
Luo, Yi; Chen, Zhuo; Hershey, John R et al. (2017) Deep clustering and conventional networks for music separation: Stronger together. Proc IEEE Int Conf Acoust Speech Signal Process 2017:61-65
Yildiz, Izzet B; Mesgarani, Nima; Deneve, Sophie (2016) Predictive Ensemble Decoding of Acoustical Features Explains Context-Dependent Receptive Fields. J Neurosci 36:12338-12350
Moses, David A; Mesgarani, Nima; Leonard, Matthew K et al. (2016) Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity. J Neural Eng 13:056004
Räsänen, Okko; Nagamine, Tasha; Mesgarani, Nima (2016) Analyzing Distributional Learning of Phonemic Categories in Unsupervised Deep Neural Networks. Cogsci 2016:1757-1762
Khalighinejad, Bahar; Long, Laura Kathleen; Mesgarani, Nima (2016) Designing a hands-on brain computer interface laboratory course. Conf Proc IEEE Eng Med Biol Soc 2016:3010-3014
Hullett, Patrick W; Hamilton, Liberty S; Mesgarani, Nima et al. (2016) Human Superior Temporal Gyrus Organization of Spectrotemporal Modulation Tuning Derived from Speech Stimuli. J Neurosci 36:2014-2026

Showing the most recent 10 out of 11 publications