The objective of the proposed research is to extend our models of vocal disorders so that we may further quantify the relationship between the acoustic aspects of voice quality and the physiology of voice production. More specifically, the objective is to model and quantify aspects of vocal disorders that may be caused by variations in laryngeal and oral tract function, possibly brought about by a structural alteration of the speech production system due to a pathology or changes in physiology. We hypothesize that quantification of vocal quality can be achieved through the development of interactive speech production models that include both phonatory and resonance characteristics, each capable of being separately controlled by the researcher using parameters extracted from the speech signal. There are three major aspects of the proposed research. The first is to improve the resolution of our existing models of several voice types (modal register, vocal fry, falsetto, and breathy). In addition, we will extend our models to other voice types (e.g., hoarse and harsh). These models will be used to investigate the cause-and-effect relationships between voice production features and aspects of the acoustic signal through the use of interactive, high-quality speech synthesis. The second aspect of the proposal is to refine the articulatory speech synthesizer to allow the models of voice types to be related to speech production physiology, e.g., articulator position and movement, and vocal fold vibratory motion. The third aspect is to develop quantitative measures of speech quality and vocal dysfunction. Illustrative examples of the results to be achieved through this research are the quantification of the glottal flow waveform, its spectral characteristics, and the turbulent noise characteristics that correspond to variations in vocal quality.
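As a rough illustration of the kind of glottal flow waveform quantification described above, the sketch below generates one period of a Rosenberg-style glottal pulse, a classical parametric model of glottal volume-velocity. This is a generic textbook model offered for orientation only, not the specific models developed under this grant; the parameter names (`open_quotient`, `speed_quotient`) and default values are illustrative assumptions.

```python
import numpy as np

def rosenberg_pulse(fs=16000, f0=120.0, open_quotient=0.6, speed_quotient=2.0):
    """One pitch period of a Rosenberg-style glottal flow pulse.

    fs             : sampling rate in Hz (illustrative default)
    f0             : fundamental frequency in Hz
    open_quotient  : fraction of the period during which the glottis is open
    speed_quotient : ratio of opening-phase to closing-phase duration
    """
    n = int(round(fs / f0))                  # samples per pitch period
    n_open = int(round(open_quotient * n))   # samples in the open phase
    # split the open phase between a rising and a falling segment
    n_rise = int(round(n_open * speed_quotient / (1.0 + speed_quotient)))
    n_fall = n_open - n_rise
    # raised-cosine rise from 0 toward 1, quarter-cosine fall back toward 0
    rise = 0.5 * (1 - np.cos(np.pi * np.arange(n_rise) / max(n_rise, 1)))
    fall = np.cos(np.pi * np.arange(n_fall) / (2 * max(n_fall, 1)))
    pulse = np.zeros(n)                      # closed phase stays at zero flow
    pulse[:n_rise] = rise
    pulse[n_rise:n_rise + n_fall] = fall
    return pulse
```

Varying parameters of this sort is how such models relate physiology to quality: for example, a larger open quotient is commonly associated with breathier voice, and the abruptness of the closing phase shapes the spectral tilt of the excitation.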
In addition, the results will relate aspects of vocal fold mass and length, aperiodicity of vocal fold motion, and glottal area to vocal quality factors. A unique aspect of the research is that the voice models and speech synthesizers will be interactive, allowing the user to adjust features of the models that are related to physiological characteristics of the speech production process. By making such parameter adjustments, the researcher will be able to interactively test new hypotheses concerning the cause and severity of a vocal disorder.
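One standard way to quantify the aperiodicity of vocal fold motion mentioned above is relative jitter: the mean absolute difference between consecutive pitch periods, normalized by the mean period. The sketch below shows this common measure under the assumption that pitch-period durations have already been extracted from the signal; it is a generic formulation, not necessarily the specific dysfunction measure proposed in the grant.

```python
import numpy as np

def local_jitter(periods):
    """Relative (local) jitter of a sequence of pitch-period durations.

    periods : sequence of consecutive pitch-period lengths (ms or samples;
              the measure is dimensionless, so the unit does not matter).
    Returns mean |P[i+1] - P[i]| divided by the mean period.
    """
    p = np.asarray(periods, dtype=float)
    if p.size < 2:
        raise ValueError("need at least two pitch periods")
    return float(np.mean(np.abs(np.diff(p))) / np.mean(p))
```

A perfectly periodic voice gives a jitter of zero, while pathological aperiodicity of vocal fold motion raises it; for example, `local_jitter([8.0, 8.0, 8.0])` returns `0.0`.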

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Research Project (R01)
Project #
2R01DC000577-04
Application #
3217140
Study Section
Sensory Disorders and Language Study Section (CMS)
Project Start
1989-04-01
Project End
1995-03-31
Budget Start
1992-04-01
Budget End
1993-03-31
Support Year
4
Fiscal Year
1992
Total Cost
Indirect Cost
Name
University of Florida
Department
Type
Schools of Engineering
DUNS #
073130411
City
Gainesville
State
FL
Country
United States
Zip Code
32611
Formby, C; Childers, D G; Lalwani, A L (1996) Labelling and discrimination of a synthetic fricative continuum in noise: a study of absolute duration and relative onset time cues. J Speech Hear Res 39:4-18
Childers, D G; Ahn, C (1995) Modeling the glottal volume-velocity waveform for three voice types. J Acoust Soc Am 97:505-19
Childers, D G; Hu, H T (1994) Speech synthesis by glottal excited linear prediction. J Acoust Soc Am 96:2026-36
Childers, D G; Wong, C F (1994) Measuring and modeling vocal source-tract interaction. IEEE Trans Biomed Eng 41:663-71
Childers, D G; Bae, K S (1992) Detection of laryngeal function using speech and electroglottographic data. IEEE Trans Biomed Eng 39:19-25
Wu, K; Childers, D G (1991) Gender recognition from speech. Part I: Coarse analysis. J Acoust Soc Am 90:1828-40
Childers, D G; Lee, C K (1991) Vocal quality factors: analysis, synthesis, and perception. J Acoust Soc Am 90:2394-410
Childers, D G; Wu, K (1991) Gender recognition from speech. Part II: Fine analysis. J Acoust Soc Am 90:1841-56
Eskenazi, L; Childers, D G; Hicks, D M (1990) Acoustic correlates of vocal quality. J Speech Hear Res 33:298-306