Hybrid Speech Synthesis for Voice Output Communication Aids

Hertz, Susan

Abstract

NovaSpeech proposes to develop an innovative perceptually-oriented hybrid approach to unconstrained speech synthesis for generating individualized, customized voices of either gender and any age. The system will provide human-sounding, intelligible, and mimetic speech, yet have small storage requirements, be able to support the cost-efficient addition of new voices, and be suitable for implementation on virtually any hardware platform. As a result, the technology will be well-suited to virtually any unlimited vocabulary synthesis application, but be of special benefit to speech-impaired individuals, who have a particularly great need for natural-sounding, individualized voices on a broad range of devices. With the hybrid system, individuals who know they will lose their voice due to illness or surgery will be able to cost-efficiently capture and utilize their pre-injury voice in a voice output communication aid; and all speech-impaired users will be able to obtain reliable, appropriate, individualized voices that can grow with them as they mature and age. No existing synthesis approach meets these needs, with each type of technology trading off one desirable property for another, be it low storage requirements for natural voice quality, or human voice quality for flexibility. The hybrid approach overcomes these limitations by integrating, in a novel and principled way, the best features of two well-known synthesis techniques: corpus-based waveform concatenation and rule-based formant synthesis. Capitalizing on a number of important perceptual principles, the system will prestore only a small number of intrinsic units, such as stressed vowels, from the target speaker, and synthesize other, adaptable units by rule. Thus with only a small prestored speech corpus, and a common set of rules across voices, it will produce speech that sounds like the intended speaker. 
In its proposed Phase II project, NovaSpeech will develop a complete hybrid prototype text-to-speech (TTS) system for eight voices in General American English: six base speakers (male and female children, adults, and elderly adults) and two speakers who know they will lose their ability to speak naturally as a result of future laryngectomies. Year 1 will focus on exploring possible system architectures, implementing rules for adaptable units, and exploring through perceptual experiments possible strategies for storing and selecting intrinsic units. Year 2 will focus on implementing a fully functional hybrid TTS prototype for the six base voices. By month six of Year 2 at the latest, the company will verify the ability to add new voices quickly by implementing the voices of the two laryngectomy patients, providing them with functional systems for their voices, and obtaining feedback from them and from those who know them about the quality of the voices and the system features. The ultimate objective of the hybrid project is to improve the naturalness and mimetic quality of speech synthesized from unrestricted symbolic input, with the particular goal of enhancing the utility and flexibility of voice output communication aids for speech-impaired individuals.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Project #: 5R44DC006761-03
Application #: 7271981
Study Section: Special Emphasis Panel (ZRG1-BBBP-B (10))
Program Officer: Shekim, Lana O

Project Start: 2004-04-01
Project End: 2010-07-31
Budget Start: 2007-08-01
Budget End: 2010-07-31
Support Year: 3
Fiscal Year: 2007
Total Cost: $376,850

Institution

Name: Novaspeech, LLC
DUNS #: 144511263

City: Ithaca
State: NY
Country: United States
Zip Code: 14850

Related projects


NIH 2007 R44 DC	Hybrid Speech Synthesis for Voice Output Communication Aids	Hertz, Susan R. / Novaspeech, LLC	$376,850
NIH 2006 R44 DC	Hybrid Speech Synthesis for Voice Output Communication Aids	Hertz, Susan R. / Novaspeech, LLC	$373,451
