People are remarkably adept at making sense of the world through sound: understanding speech in a noisy restaurant, picking out the voice of a family member, or recognizing a familiar melody. Although we take these abilities for granted, they reflect impressive computational feats of biological engineering that are remarkably difficult to replicate in machine systems. The long-term goal of my research program is to develop computational and experimental methods to reverse-engineer how the brain codes natural sounds like speech, and to exploit these advances to understand and aid in the treatment of hearing impairment. One of the central challenges of coding natural sounds is that they are structured at many different timescales, from milliseconds to seconds and even minutes. How does the brain integrate across these diverse timescales to derive meaning from sound? Answering this question has been challenging because there are no general-purpose methods for measuring neural timescales in the brain. As a consequence, we know relatively little about how neural timescales are organized in auditory cortex and how this organization enables the coding of natural sounds. To overcome these limitations, we have developed a simple experimental paradigm (the "temporal context invariance" or TCI paradigm) for estimating the temporal integration period of any sensory response: the time window within which stimuli alter the response. We apply the TCI method to human electrocorticography (ECoG) and animal physiology recordings to reveal the organization of neural timescales at both the regional and single-cell level (Aim I). Pilot data from our analyses reveal that timescales are organized hierarchically, with higher-order regions showing substantially longer integration periods. To explore the functional significance of this timescale hierarchy, we couple TCI with computational techniques well-suited to characterizing natural sounds (Aim II). We test whether longer integration periods enable a more noise-robust representation of speech (Aim IIA), whether regions with longer integration periods code higher-order properties of natural sounds (Aims IIB & IIC), whether there are dedicated integration periods for important sound categories like speech or music (Aim IID), and whether cortical integration periods can be explained by the duration of the features they respond to (Aim IIE). In the process of conducting this research, I will be trained in two critical areas: (1) ECoG, which is the only method with the spatial and temporal precision needed to determine how neural timescales are organized in the human brain; and (2) deep neural networks (DNNs), which are the only models able to perform challenging perceptual tasks at human levels and to predict neural responses in higher-order cortical regions. After completing this training, I will have a unique set of experimental (fMRI, ECoG, psychophysics) and computational skills (data-driven statistical modeling and hypothesis-driven DNN modeling) that will facilitate my transition to an independent investigator.
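To illustrate the logic behind the TCI approach, the minimal Python sketch below simulates the core idea under simplifying assumptions (a toy "neuron" that averages its input over a fixed window; the model, parameter values, and function names are hypothetical and do not represent the actual analysis code). When an identical segment is embedded in two different random contexts, responses to that segment become correlated across contexts only once the segment outlasts the model's integration window; the shortest such duration therefore provides an estimate of the integration period.

    import numpy as np

    rng = np.random.default_rng(0)
    integration_win = 20   # true integration window of the toy neuron, in samples (hypothetical)
    ctx_len = 100          # length of the random context flanking each shared segment (hypothetical)

    def model_response(stimulus, win=integration_win):
        # Toy "neuron": average the stimulus over a sliding window of length win.
        return np.convolve(stimulus, np.ones(win) / win, mode="same")

    def cross_context_correlation(seg_len, n_trials=200):
        # Present the same segment flanked by two different random contexts and
        # correlate the responses to the shared portion across the two contexts.
        corrs = []
        for _ in range(n_trials):
            segment = rng.standard_normal(seg_len)
            resp_a = model_response(np.concatenate(
                [rng.standard_normal(ctx_len), segment, rng.standard_normal(ctx_len)]))
            resp_b = model_response(np.concatenate(
                [rng.standard_normal(ctx_len), segment, rng.standard_normal(ctx_len)]))
            shared = slice(ctx_len, ctx_len + seg_len)
            corrs.append(np.corrcoef(resp_a[shared], resp_b[shared])[0, 1])
        return float(np.mean(corrs))

    for seg_len in (5, 10, 20, 40, 80):
        print(f"segment length {seg_len:3d} samples -> "
              f"cross-context correlation {cross_context_correlation(seg_len):.2f}")

In this sketch, segments much shorter than the 20-sample window yield low cross-context correlations (the response is dominated by the surrounding context), while segments well beyond it yield responses that are largely context-invariant.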
Natural sounds like speech contain information at many different timescales (e.g., phonemes, syllables, words), but how the human brain extracts this information remains unclear. Understanding this process is essential for determining how hearing impairment degrades speech perception. The proposed research will reveal how neural timescales are organized in the brain and how this organization facilitates the coding of natural sounds like speech, a critical first step toward understanding how this code is disrupted by hearing loss.