Speech carries rich linguistic information over a large range of temporal scales: the average durations of phonemes, syllables, words and sentences range from tens of milliseconds to several seconds. Thus, to achieve successful speech perception, the acoustic speech signal needs to be analyzed over appropriate temporal scales so that each scale can interface with its respective linguistic representation. Where and how this acousto-linguistic mapping of temporal speech properties occurs is still not fully explained in current speech/language models. Here, we show how cortical processing of acoustic temporal structure in speech is modulated by higher-level linguistic analysis. This requires two essential features: (1) control over the temporal scale at which analysis occurs; (2) control over the linguistic content of the information. For (1), we use a novel sound-quilting algorithm that controls the temporal structure in speech at different temporal scales by shuffling and then stitching together speech segments of a certain length; this approach yields new "speech quilt" signals that preserve the natural temporal structure in the original source signal only up to the set segment length, but not beyond. The segment lengths (30, 120, 480, and 960 ms) are chosen to span the typical temporal range of phonemes, syllables, and words. For (2), we manipulate speech familiarity by using recordings of bilingual speakers, reading from a book in English and Korean, as the source signal to create speech quilts in two languages. This approach ensures that any changes at the signal-acoustics level affect both languages identically, while manipulating the linguistic percept differently. Thus, neural responses that vary as a function of segment length but are shared or similar across the two languages will suggest analysis at the signal-acoustics level, whereas neural responses that differ based on language familiarity will imply the presence of linguistic processes.
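The shuffle-and-stitch idea behind the quilting manipulation can be illustrated with a minimal sketch. It is not the project's algorithm: it only cuts a signal into equal-length segments, permutes them at random, and concatenates them, and it omits any boundary smoothing or segment-ordering constraints the actual procedure may apply. The function name `make_quilt`, its parameters, and the ramp stand-in signal are hypothetical and chosen for illustration only.

```python
import random


def make_quilt(samples, sample_rate, segment_ms, seed=0):
    """Simplified quilt: cut into equal segments, shuffle, concatenate.

    Hypothetical helper for illustration; no boundary smoothing is applied.
    """
    seg_len = int(round(sample_rate * segment_ms / 1000))
    if seg_len <= 0:
        raise ValueError("segment length must be at least one sample")

    # Drop the incomplete trailing segment so all segments have equal length.
    n_segments = len(samples) // seg_len
    segments = [samples[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]

    # Random permutation of segment indices (seeded for reproducibility).
    rng = random.Random(seed)
    order = list(range(n_segments))
    rng.shuffle(order)

    # Stitch the segments together in the shuffled order.
    quilt = []
    for idx in order:
        quilt.extend(segments[idx])
    return quilt, order


# Stand-in signal: a 3 s ramp sampled at 1 kHz (assumed values, not real speech).
sample_rate = 1000
signal = list(range(3000))
quilt, order = make_quilt(signal, sample_rate, segment_ms=480)
```

Within each 480 ms segment the original sample order is intact, while the order of segments is scrambled, so structure is preserved only up to the chosen segment length.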
In Aim 1, we argue (using fMRI) that temporal acoustic structure in speech is extracted in superior temporal sulcus (STS) for both languages; linguistic processes, originating in inferior frontal gyrus (IFG), become engaged in a familiar language only and in turn modulate such signal-acoustics level analyses in anterior and posterior STS.
In Aim 2, we capitalize on the high temporal resolution of EEG to suggest that one potential neural mechanism for the results in Aim 1 is that neurons are able to phase-lock more strongly to the speech quilt signal as its natural temporal structure increases (longer segment lengths), which in turn is again modulated and enhanced by speech familiarity. The results will have a significant impact on speech/language models that need to account for where and how specific temporal scales in speech interface with their linguistic representations, while also informing approaches towards clinical populations such as children struggling to decode critical temporal speech units, as in dyslexia or auditory processing disorder (APD).

Public Health Relevance

Speech is a fundamental human communication signal; if speech perception is impaired, e.g. through deafness, or breaks down, e.g. through hearing impairment or stroke, the implications are vast and include feelings of isolation, depression, and cognitive decline. The current project aims to advance our understanding of how linguistic processes modulate the analysis of speech structure that extends over different temporal scales. A better understanding of the neural representation of speech-specific temporal scales and their modulation by linguistic processes may guide therapies for children with dyslexia, inform cochlear implant processing strategies, and help advise neurosurgical interventions.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21DC016386-01A1
Application #
9517189
Study Section
Communication Disorders Review Committee (CDRC)
Program Officer
Shekim, Lana O
Project Start
2018-03-05
Project End
2021-02-28
Budget Start
2018-03-05
Budget End
2019-02-28
Support Year
1
Fiscal Year
2018
Total Cost
Indirect Cost
Name
Duke University
Department
Psychology
Type
Schools of Arts and Sciences
DUNS #
044387793
City
Durham
State
NC
Country
United States
Zip Code
27705