NovaSpeech proposes to study an innovative hybrid technique for generating individualized child, adult, female, and male voices. This hybrid speech generation strategy is particularly well suited to the needs of the speech-impaired community. It integrates corpus-based waveform concatenation and rule-based formant synthesis in a novel and principled way, leveraging the strengths of each technique. The proposal presents both a set of preliminary results and a theoretical foundation for expecting that the proposed method will efficiently generate higher-quality speech than is possible with either classical formant synthesis or more recent corpus-based techniques. The hybrid system promises to make the generation of mimetic speech (speech that sounds like a designated speaker) more practical and successful, thereby enabling the creation of more effective and satisfactory voice output communication aids.
The specific aim of Phase I of the proposed study is to test four key hypotheses by recording child, adult, female, and male speakers; generating spoken test sentences using a variety of hybrid and classical synthesis techniques; and performing formal listening tests to obtain perceptual judgments of the naturalness and mimetic quality of the test sentences. The longer-term objective in Phase II is to develop a complete hybrid speech synthesis system for the generation of unrestricted speech from an abstract linguistic representation. The ultimate objective is to improve the naturalness and mimetic quality of speech synthesized from unrestricted symbolic input, with the particular goal of enhancing the utility and flexibility of voice output communication aids for speech-impaired individuals.