This project focuses on developing new speech processing techniques that will transform access to large, asynchronous, multi-channel, and diverse collections of multimedia materials. In particular, the algorithms developed are being employed to create a novel multi-source, multi-scale event reconstruction system that brings together the massive archives of the Apollo lunar missions to enable experiential interaction with historical materials. Research advancements focus on state-of-the-art acoustic environment analysis, speech recognition including keyword spotting, speaker identification under adverse conditions, multimodal content alignment, and automated linking of events and entities from spoken content. Specifically, the research is developing: (i) new techniques for noise- and channel-robust acoustic processing that exploit missing-features concepts with novel feature extraction and compensation techniques; (ii) a new articulatory framework for speech recognition that is robust to variations in speech production; (iii) environmental "sniffing" techniques that automatically characterize acoustic environments to improve robustness; and (iv) automatic detection of novel task-specific audio events. Because the data are asynchronous, unique speech analytics techniques are being formulated to address the large number of "local loop" intercom circuits in the NASA Mission Control Center, audio recorded onboard the two Apollo spacecraft during specific mission events, and space-to-ground radio circuits. These speech, language, and knowledge extraction advancements will be integrated into a new automated evaluation model that reflects the specific challenges encountered in the event reconstruction task. The platform will be deployed and evaluated by actual users from the Science and Engineering Education Center (SEEC) of the University of Texas at Dallas.
Integrating robust speech processing algorithms with event reconstruction systems will have a direct and immediate impact on education, society, and government organizations. Working with NASA's Apollo mission data allows for the development of speech technology for challenging audio that contains severe communication channel artifacts; cross-talk, static, and tones; and low signal-to-noise ratios. The software being developed in this project will be made available to any non-profit organization for use in audio/video search (available for download with training modules). Students working on senior design teams will also develop a Contact Science station, to be deployed in Dallas, TX, and overseen by the University of Texas at Dallas Science and Engineering Education Center, to illustrate and assess student use of these advancements. As a lasting legacy, the project team includes eminent historians of human space flight who will explore opportunities to deploy the event reconstruction system in a museum setting, where it can support both scholarship and public engagement; the system itself will be made available on an open-source basis to support other researchers.