This research addresses one of the remaining challenges in speech science: high-quality speech simulation. We define speech simulation as a form of speech synthesis in which the movement of air and tissue, rather than the resulting acoustic signal, is under experimental control. Since the early days of speech synthesis nearly half a century ago, the expectation has been that a better representation of the physics of air and tissue in motion would produce better synthesis. Although this expectation persists, the payoff has been slow, primarily because there are few data sets from which to build theoretical generalizations. In this proposal, the principal investigator and his colleagues draw upon experience gained in simulating the phonatory processes and extend it to include the entire vocal tract in sentence-level speech production. The first phase will aim for naturalness in speech quality comparable to that of formant synthesis by modeling a few specific speakers for whom extensive data sets will be available. The second phase will develop scaling and modification rules that allow the voice of a given speaker to be transformed to a different age, gender, emotion, and voice quality. The transformation will also include induced or corrected voice and speech disorders. The idea of voice transformation (conversion) is not new, but the attempt to do it entirely in the articulatory domain is relatively untried. The results will have practical and theoretical impact on the development of assistive devices for voice- and speech-impaired populations, on surgery performed on the larynx and upper respiratory tract, and on speech training and rehabilitation.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Research Project (R01)
Project #
5R01DC002532-05
Application #
6030194
Study Section
Sensory Disorders and Language Study Section (CMS)
Project Start
1995-07-01
Project End
2001-06-30
Budget Start
1999-07-01
Budget End
2001-06-30
Support Year
5
Fiscal Year
1999
Name
Denver Center for the Performing Arts
DUNS #
010615888
City
Denver
State
CO
Country
United States
Zip Code
80204
Titze, Ingo R (2002) Regulating glottal airflow in phonation: application of the maximum power transfer theorem to a low dimensional phonation model. J Acoust Soc Am 111:367-76
Bergan, C C; Titze, I R (2001) Perception of pitch and roughness in vocal signals with subharmonics. J Voice 15:165-75
Titze, I R (2001) Acoustic interpretation of resonant voice. J Voice 15:519-28
Story, B H; Titze, I R; Hoffman, E A (2001) The relationship of vocal tract shape to three voice qualities. J Acoust Soc Am 109:1651-67
Story, B H; Titze, I R; Hoffman, E A (1998) Vocal tract area functions for an adult female speaker based on volumetric imaging. J Acoust Soc Am 104:471-87