The objectives of the proposed research are to develop and to implement a model of prosody in which the control parameters are based on physiological parameters. In particular, the model will specify the control of intonation and the state of the glottal source during the production of continuous speech. In the model, phonological descriptors of intonation will first be mapped to physiological control parameters consisting of the subglottal pressure and parameters relating to the vocal-fold tension and the glottal configuration. These physiological control parameters will then be transformed into parameters specifying the acoustic attributes of the glottal source, including the fundamental frequency (F0), amplitude, and waveform shape of the glottal pulses, and the presence of glottalization. Variations in fundamental frequency and glottal waveform due to intrinsic effects of vowel height and obstruent voicing will be incorporated in the model. A second, related objective is to implement this model and to incorporate it into an existing quasi-articulatory speech synthesizer, HLsyn. The effectiveness of the model in creating improved prosody in the synthetic speech will be evaluated through tests in which listeners judge the intelligibility and quality of the speech. Development of the model will be based on data obtained from simultaneous acoustic and physiological measures of airflows and pressures from utterances with a variety of prosodic shapes produced by several talkers. From the acoustic and aerodynamic data, some of the physiological parameters such as glottal configuration, subglottal pressure, and vocal-fold stiffness will be inferred. These data will be used to answer two questions: (1) When sequences of syllables with specified pitch accents and degrees of reduction are produced, how do the physiological measures change with time? (2) What are the equations relating the physiological parameters to the acoustic parameters that describe the glottal source?
This research will lead to proposed methods for classifying prosodic elements such as pitch accents and boundary tones in terms of physiological parameters that specify how these elements are produced. It is recognized that deficiencies in respiratory and laryngeal control underlie a variety of speech disorders. The models developed here will provide some insight into how human speakers control these structures to produce the prosodic aspects of speech. This knowledge will help clinicians in the development of procedures for remediating prosodic disorders of speech production. The synthesizer will also have the capability of demonstrating to clinicians or students the consequences of deviant control of respiration and of laryngeal state.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Research Project (R01)
Project #
1R01DC004331-01A1
Application #
6167199
Study Section
Special Emphasis Panel (ZRG1-BBBP-7 (01))
Program Officer
Shekim, Lana O
Project Start
2000-08-15
Project End
2003-07-31
Budget Start
2000-08-15
Budget End
2001-07-31
Support Year
1
Fiscal Year
2000
Total Cost
$250,069
Indirect Cost
Name
Sensimetrics Corporation
Department
Type
DUNS #
City
Malden
State
MA
Country
United States
Zip Code
02148