Computer algorithms that analyze auditory scenes and extract individual sound sources would have a strong impact in several domains. For example, to enable natural voice interaction with computing devices, an automatic speech recognition system must be able to focus on the voice of the person speaking to it and ignore sounds from all other sources. A hearing device must perform a similar task to allow a hearing-impaired person to conduct a conversation in a noisy environment with multiple sound sources. Building on recent advances in machine learning and signal processing, we are developing sophisticated adaptive algorithms for analyzing auditory scenes with multiple sound sources. Our algorithms are based on probabilistic models of the individual sound sources and of the manner in which they overlap one another and are distorted by reverberation and background noise. We use recent inference techniques to fit these models to sound data captured by a microphone array, to separate those data into individual sources, and to automatically determine the type and location of each source present. Moreover, by reconstructing the clean signal of each individual source, we dramatically enhance the accuracy of automatic speech recognition for human speakers in multiple-source environments. To facilitate the development and evaluation of our algorithms, and to encourage competition among other research groups, ultimately resulting in improved techniques, we are collecting a large dataset of multiple-source auditory scenes and making it publicly available on a dedicated website.
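As a toy illustration of the separation problem only (not the probabilistic inference pipeline described above), the sketch below mixes two synthetic sources with a hypothetical, known instantaneous mixing matrix and recovers them by inverting it. All signal values and gains are made up for illustration; real auditory scenes involve unknown, convolutive (reverberant) mixing, which is why the mixing system must be inferred from the microphone-array data rather than inverted directly.

```python
import numpy as np

# Two synthetic sources: a 220 Hz tone standing in for a speaker,
# and a slow square wave standing in for background interference.
t = np.linspace(0.0, 1.0, 8000)
sources = np.stack([
    np.sin(2 * np.pi * 220 * t),          # "speaker" source
    np.sign(np.sin(2 * np.pi * 3 * t)),   # "background" source
])

# Hypothetical gains from each source to each of two microphones
# (instantaneous mixing; real rooms add reverberation and noise).
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
mics = A @ sources  # observed mixtures at the two microphones

# With the mixing known, separation reduces to matrix inversion;
# probabilistic methods must instead estimate the mixing (or
# time-frequency masks) from the observed data alone.
recovered = np.linalg.inv(A) @ mics
print(np.allclose(recovered, sources))  # → True
```

The point of the sketch is the shape of the problem: each microphone records a weighted sum of all sources, and separation means undoing that sum without knowing the weights in advance.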