One powerfully communicative aspect of language is prosody, which is the use of pitch, loudness, local rate modulation, and pauses in speech to signal its informational and affective content. Speech production deficits due to neurological damage such as stroke, or due to Autism Spectrum Disorder, are often characterized by prosodic irregularities. The long-term objective of the proposed research program is to understand how linguistic structure and communicative context condition the spatiotemporal realization of articulatory movement during speaking. Our linguist-engineer team studies the signatures of prosody at the level of articulatory patterning.
The specific aims of this proposal are to understand and model (1) how speakers differentially modulate the spatiotemporal organization of articulatory gestures as a function of the cognitive source of a break in the speech stream, and (2) how the communicative context influences the temporal flow of the speech stream in articulation as speakers interact with one another. We outline a research strategy that investigates the relation between speech initiation and cessation and the control and coordination of articulation. Speech may start, pause, or cease for a variety of reasons beyond linguistically structured phrase edges. Some breaks in the speech stream may be cognitively planned, such as interlocutor turn-taking in discourse. Other disruptions might be unplanned, such as interruptions and word-finding challenges. Our approach investigates the articulation of speech in the vocal tract at turn-taking and interruptions in structured dialogue, and in the vicinity of pauses that occur for cognitive speech planning reasons. We complement this experimental work with computational modeling of phrasal junctures and pauses, and with machine learning approaches to classifying breaks in speech arising from differing sources.
The specific aims will be pursued by using articulatory movement data collected with magnetometer systems for tracking movement inside the mouth and by using our team's computational model of speech production. The experimental work and the concomitant computational modeling of the articulatory findings will provide a profile of the manner in which articulatory patterning is shaped by the larger informational structuring of utterances and by the demands of speech planning in a communicative context.

Public Health Relevance

One powerfully communicative aspect of language is prosody, which is the use of pitch, loudness, and temporal properties such as pauses in speech to signal its informational and affective content. Speech production deficits due to neurological damage such as stroke, or due to Autism Spectrum Disorder, are often characterized by prosodic irregularities. Understanding how structural prosody and its deployment in communication influence the temporal flow of speech can have critical translational impact, because disfluencies are typically used as a basis for diagnosing speech disorders. Our research uses instrumental tracking of articulatory movements during speech to provide an understanding of the normative production of modulation and pauses in speech flow. This understanding could support evidence-driven assessments and treatments of prosodic breakdown in clinical populations, including the deployment of assistive technologies for the impaired, such as automatic speech recognition and machine speech synthesis.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Research Project (R01)
Project #
4R01DC003172-19
Application #
9036989
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Shekim, Lana O
Project Start
1997-07-01
Project End
2017-03-31
Budget Start
2016-04-01
Budget End
2017-03-31
Support Year
19
Fiscal Year
2016
Total Cost
Indirect Cost
Name
University of Southern California
Department
Miscellaneous
Type
Schools of Arts and Sciences
DUNS #
072933393
City
Los Angeles
State
CA
Country
United States
Zip Code
90032
Harper, Sarah; Lee, Sungbok; Goldstein, Louis et al. (2018) Simultaneous electromagnetic articulography and electroglottography data acquisition of natural speech. J Acoust Soc Am 144:EL380
Gupta, Rahul; Audhkhasi, Kartik; Jacokes, Zach et al. (2018) Modeling multiple time series annotations as noisy distortions of the ground truth: An Expectation-Maximization approach. IEEE Trans Affect Comput 9:76-89
Gupta, Rahul; Audhkhasi, Kartik; Lee, Sungbok et al. (2016) Detecting paralinguistic events in audio stream using context in features and probabilistic decisions. Comput Speech Lang 36:72-92
Gupta, Rahul; Bone, Daniel; Lee, Sungbok et al. (2016) Analysis of engagement behavior in children during dyadic interactions using prosodic cues. Comput Speech Lang 37:47-66
Ramanarayanan, Vikram; Van Segbroeck, Maarten; Narayanan, Shrikanth S (2016) Directly data-derived articulatory gesture-like representations retain discriminatory information about phone categories. Comput Speech Lang 36:330-346
Bone, Daniel; Lee, Chi-Chun; Black, Matthew P et al. (2014) The psychologist as an interlocutor in autism spectrum disorder assessment: insights from a study of spontaneous prosody. J Speech Lang Hear Res 57:1162-77
Parrell, Benjamin; Goldstein, Louis; Lee, Sungbok et al. (2014) Spatiotemporal coupling between speech and manual motor actions. J Phon 42:1-11
Bone, Daniel; Lee, Chi-Chun; Narayanan, Shrikanth (2014) Robust Unsupervised Arousal Rating: A Rule-Based Framework with Knowledge-Inspired Vocal Features. IEEE Trans Affect Comput 5:201-213
Kim, Jangwon; Lammert, Adam C; Ghosh, Prasanta Kumar et al. (2014) Co-registration of speech production datasets from electromagnetic articulography and real-time magnetic resonance imaging. J Acoust Soc Am 135:EL115-21
Ramanarayanan, Vikram; Lammert, Adam; Goldstein, Louis et al. (2014) Are articulatory settings mechanically advantageous for speech motor control? PLoS One 9:e104168

Showing the most recent 10 out of 37 publications