The Acoustic-Articulator Modeling EArly-concept Grant for Exploratory Research (EAGER) investigates the use of three-dimensional Electromagnetic Articulograph (EMA) technology to improve pronunciation error analysis and corrective articulatory feedback. Three-dimensional EMA is a recent development, and no speech or language data sets currently available include such high-resolution three-dimensional position and orientation information. This EAGER investigates the potential of this type of data. A matched acoustic-kinematic corpus with five-degree-of-freedom EMA data is being collected from both native speakers of English and nonnative speakers whose first language is Mandarin Chinese. The project also investigates methods for normalizing speakers' articulatory data into a common reference frame, and it will disseminate the data to the broader research community to promote continued work in this area.

Current tools for pronunciation assessment can provide only limited specificity in their corrective feedback, because of limits on our ability to accurately map acoustic inputs to the corresponding articulator patterns and to obtain a detailed analysis of the manner and degree of pronunciation errors. The development of effective tools for Computer Aided Language Learning (CALL) is an essential component of enabling nonnative speakers of English to quickly integrate into America's 21st century global workforce. Better tools for pronunciation assessment and accent modification can have a tremendous impact on workforce effectiveness, especially within key sectors related to medicine, science, technology, engineering, and mathematics, where it is important to continue to attract and support top talent from across the globe.

Project Report

To be competitive in the global economy, it is essential that people of all backgrounds be able to function together effectively despite language barriers, and development of language learning and accent modification tools is a key part of making this possible. To support effective learning and provide specific, useful pronunciation feedback to users, Computer Aided Language Learning (CALL) systems for pronunciation correction must be able to capture pronunciation errors and accurately identify and describe errors in articulation. Because of the difficulty of acoustic-articulator inversion and the complexity of inter-speaker differences in articulator patterns, this capacity is not yet well developed. Current systems are limited in the specificity of the corrective feedback they provide, often reporting only a "good versus bad" pronunciation match to the target and, even at best, only the general category of pronunciation error.

The EAGER: Acoustic-Articulator Modeling for Pronunciation Analysis project has addressed these key limitations through collection of the Marquette Electromagnetic Articulography dataset of Mandarin-Accented English, or EMA-MAE. EMA-MAE is a matched acoustic and three-dimensional electromagnetic articulograph (EMA) dataset that includes both native American English (L1) speakers and native Mandarin Chinese (L2) speakers who speak English as a second language. Although some matched acoustic-articulator data sets have been collected previously, these include only two-dimensional articulator motion along the mid-sagittal plane and provide no orientation data. In addition, there has been little previous work making direct comparisons in the articulatory domain, either across languages or between L1 and L2 speakers, so the collection of this new dataset represents a significant contribution.

The EMA-MAE dataset allows for detailed comparisons of differences between L1 and L2 speakers and will have meaningful long-term significance to speech research. In addition to creating the EMA-MAE dataset, we have developed a new theoretically founded approach for calibrating the articulatory data, new methods for accurately calibrating the EMA data collection process, and new ways to use EMA data to calculate useful articulatory features. These techniques, and software tools for implementing them, are included with the EMA-MAE data.

The broader impact of the work lies in its contribution to enabling nonnative speakers of English to integrate more quickly into the workforce, and through this to further our national goals as well as those individuals’ personal and professional goals. Stronger perceived accents are known to negatively affect judgments of employability, limit the speaker’s professional adequacy, increase the possibility of stigmatization, and create conditions that lead to serious communication breakdowns. The Marquette EMA-MAE corpus is a publicly available dataset and can be downloaded at http://speechlab.eece.mu.edu/emamae.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1142826
Program Officer: Tatiana D. Korelsky
Project Start:
Project End:
Budget Start: 2011-08-01
Budget End: 2014-01-31
Support Year:
Fiscal Year: 2011
Total Cost: $177,736
Indirect Cost:
Name: Marquette University
Department:
Type:
DUNS #:
City: Milwaukee
State: WI
Country: United States
Zip Code: 53201