One powerfully communicative aspect of language is prosody, which is the use of pitch, loudness, local rate modulation, and pauses in speech to signal its informational and affective content. Speech production deficits due to neurological damage such as stroke or due to Autism Spectrum Disorder are often characterized by prosodic irregularities. The long-term objective of the proposed research program is to understand how linguistic structure and communicative context condition the spatiotemporal realization of articulatory movement during speaking. Our linguist-engineer team studies the "signatures" of prosody at the level of articulatory patterning.
The specific aims of this proposal are to understand and model (1) how speakers differentially modulate the spatiotemporal organization of articulatory gestures as a function of the cognitive source of a break in the speech stream and (2) how the communicative context influences the temporal flow of the speech stream in articulation as speakers interact with one another. We outline a research strategy that investigates the relation between speech initiation/cessation and the control and coordination of articulation. Speech may start, pause, or cease for a variety of reasons in addition to linguistically structured phrase edges. Some breaks in the speech stream may be cognitively "planned," such as interlocutor turn-taking in discourse. Other disruptions in the speech stream might be "unplanned," such as interruptions and word-finding challenges. Our approach investigates the articulation of speech in the vocal tract at turn-taking and interruptions in structured dialogue and in the vicinity of pauses that occur for cognitive speech planning reasons. We complement this experimental work with computational modeling of phrasal junctures and pauses, and with machine learning approaches to classifying breaks in speech arising from differing sources.
The specific aims will be pursued by using articulatory movement data collected with magnetometer systems for tracking movement inside the mouth and by using our team's computational model of speech production. The experimental work and the concomitant computational modeling of the articulatory findings will provide a profile of the manner in which articulatory patterning is shaped by the larger informational structuring of utterances and by the demands of speech planning in a communicative context.

Public Health Relevance

One powerfully communicative aspect of language is prosody, which is the use of pitch, loudness, and temporal properties such as pauses in speech to signal its informational and affective content. Speech production deficits due to neurological damage such as stroke or due to Autism Spectrum Disorder are often characterized by prosodic irregularities. Understanding how structural prosody and its deployment in communication influence the temporal flow of speech can have critical translational impact, because disfluencies are typically used as a basis for diagnosing speech disorders. Our research uses instrumental tracking of articulatory movements during speech to provide an understanding of normative production of modulation and pauses in speech flow. This understanding could support evidence-driven assessments and treatments of prosodic breakdown in clinical populations, including the deployment of assistive technologies, such as automatic speech recognition and machine speech synthesis, for the impaired.

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 2R01DC003172-15A1
Application #: 8282659
Study Section: Special Emphasis Panel (ZRG1-BBBP-L (04))
Program Officer: Shekim, Lana O
Project Start: 1997-07-01
Project End: 2017-03-31
Budget Start: 2012-04-01
Budget End: 2013-03-31
Support Year: 15
Fiscal Year: 2012
Total Cost: $549,459
Indirect Cost: $179,803

Name: University of Southern California
Department: Miscellaneous
Type: Schools of Arts and Sciences
DUNS #: 072933393
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089

Parrell, Benjamin; Goldstein, Louis; Lee, Sungbok et al. (2014) Spatiotemporal coupling between speech and manual motor actions. J Phon 42:1-11
Parrell, Benjamin; Lee, Sungbok; Byrd, Dani (2013) Evaluation of prosodic juncture strength using functional data analysis. J Phon 41:
Ananthakrishnan, Sankaranarayanan; Narayanan, Shrikanth (2009) Unsupervised Adaptation of Categorical Prosody Models for Prosody Labeling and Speech Recognition. IEEE Trans Audio Speech Lang Processing 17:138-149
Kalinli, Ozlem; Narayanan, Shrikanth (2009) Prominence Detection Using Auditory Attention Cues and Task-Dependent High Level Information. IEEE Trans Audio Speech Lang Processing 17:1009-1024
Ananthakrishnan, Sankaranarayanan; Narayanan, Shrikanth (2008) A novel algorithm for unsupervised prosodic language model adaptation. Proc IEEE Int Conf Acoust Speech Signal Process ICASSP :4181-4184
Sridhar, Vivek Kumar Rangarajan; Narayanan, Shrikanth; Bangalore, Srinivas (2008) Modeling the intonation of discourse segments for improved online dialog act tagging. Proc IEEE Int Conf Acoust Speech Signal Process ICASSP 4518789:5033-5036
Walker, Rachel; Byrd, Dani; Mpiranya, Fidele (2008) An articulatory view of Kinyarwanda coronal harmony. Phonology 25:499-535
Meireles, A R; Barbosa, P A (2008) Lexical reorganization in Brazilian Portuguese: an articulatory study. Speech Commun 50:916-924
Sridhar, Vivek Kumar Rangarajan; Bangalore, Srinivas; Narayanan, Shrikanth S (2008) Exploiting Acoustic and Syntactic Features for Automatic Prosody Labeling in a Maximum Entropy Framework. IEEE Trans Audio Speech Lang Processing 16:797-811
Byrd, Dani; Lee, Sungbok; Campos-Astorkiza, Rebeka (2008) Phrase boundary effects on the temporal kinematics of sequential tongue tip consonants. J Acoust Soc Am 123:4456-65

Showing the most recent 10 out of 21 publications