RI: Medium: Collaborative Research: Explicit Articulatory Models of Spoken Language, with Application to Automatic Speech Recognition

Bilmes, Jeffrey

Abstract

Institution: Toyota Technological Institute at Chicago
Abstract Date: 05/22/09

This award is funded under the American Recovery and Reinvestment Act of 2009 (Public Law 111-5).

One of the main challenges in automatic speech recognition is variability in speaking style, including speaking rate changes and coarticulation. Models of the articulators (such as the lips and tongue) can succinctly represent much of this variability. Most previous work on articulatory models has focused on the relationship between acoustics and articulation, but more significant improvements require models of the hidden articulatory state structure. This work has both a technological goal of improving recognition and a scientific goal of better understanding articulatory phenomena.

The project considers larger model classes than previously studied. In particular, the project develops graphical models, including dynamic Bayesian networks and conditional random fields, designed to take advantage of articulatory knowledge. A new framework for hybrid directed and undirected graphical models is being developed, in recognition of the benefits of both directed and undirected models, and of both generative and discriminative training. The project activities include major extension of earlier articulatory models with context modeling, asynchrony structures, and specialized training; development of factored conditional random field models of articulatory variables; and discriminative training to alleviate word confusability.

The scientific goal addresses questions about the ways in which articulatory trajectories vary in different contexts. Existing databases are used, and initial work in manual articulatory annotation is being extended.
In addition, the project uses articulatory models to perform forced transcription of larger data sets, providing an additional resource for the research community. Other broad impacts include new models and techniques with applicability to other time-series modeling problems. Extending the applicability of speech recognition will help it fulfill its promise of enabling more efficient storage of and access to spoken information, and equalizing the technological playing field for those with hearing or motor disabilities.

Proposal: 0905633
PI Name: Livescu, Karen

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0905341
Program Officer: Tatiana D. Korelsky

Project Start
Project End
Budget Start: 2009-07-01
Budget End: 2013-06-30
Support Year
Fiscal Year: 2009
Total Cost: $378,000
Indirect Cost

Institution

University of Washington, Seattle, WA, United States
