This project encompasses the development and testing of ModelTalker, a concatenative text-to-speech (TTS) synthesis system, and InvTool, software that guides individuals through creating personal synthetic voices for use with ModelTalker. The overall system is intended to be of particular interest to users of Augmentative and Alternative Communication (AAC) devices who depend on speech synthesis for communication. In addition to offering improved naturalness and intelligibility, ModelTalker and InvTool uniquely allow rapid development of personal concatenative synthesis voices. With InvTool, individuals who are at risk of losing the ability to speak, such as those with ALS, can record their own speech for conversion to a personal synthetic voice for the ModelTalker TTS system. This voice banking capability has already been used successfully by a number of ALS patients. This Phase II STTR application seeks funding to complete the transfer of the ModelTalker and InvTool technology from the research laboratory in which it was developed to a small business for commercialization. Phase II activities will focus on InvTool and on the process of automatically constructing highly intelligible personal synthetic voices for ModelTalker. Our specific aims for Phase II are:
Aim 1: Enhance the usability of InvTool. We have identified several specific improvements to the InvTool program that will (a) improve general ease of use, (b) improve accessibility for visually impaired users, and (c) simplify the speech recording process, especially for young users and users with more limited vocabulary and literacy skills.
Aim 2: Implement and test novel speech processing techniques to improve the robustness of our automatic voice construction process while reducing the storage requirements of the speech database.
Aim 3: Evaluate the quality (intelligibility and acceptability) of automatically created voices for a representative sample of individuals who can benefit from this technology, recruited through multiple speech clinics around the country.
DiCanio, Christian; Nam, Hosung; Whalen, Douglas H et al. (2013) Using automatic alignment to analyze endangered language data: testing the viability of untrained alignment. J Acoust Soc Am 134:2235-46