Why are humans so much better than machines at recognizing speech? This research aims to measure differences between humans and machines in how they compute similarities between sounds. A computational model of speech perception will be trained on speech production data, using several types of features that are typically used in speech recognition systems. It will then be tested on its ability to predict human listeners' responses in speech sound discrimination tasks. Results are expected to provide information about how the speech that listeners hear shapes their perception of sounds, as well as how well the information used by automatic speech recognition systems matches the information used by human listeners.
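As a rough illustration of the kind of evaluation described above (not the project's actual model), the sketch below predicts a listener's response in an ABX discrimination trial by comparing distances between acoustic feature sequences. The function names (dtw_distance, predict_abx) are invented for this example, and the random arrays stand in for real feature representations such as MFCCs or model posteriors.

```python
# Hypothetical sketch: predicting an ABX discrimination response
# from distances in an acoustic feature space.
import numpy as np

def dtw_distance(a, b):
    """Dynamic-time-warping distance between two feature sequences
    (frames x dims), using Euclidean frame-level distances."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j],
                                 cost[i, j - 1],
                                 cost[i - 1, j - 1])
    return cost[n, m] / (n + m)

def predict_abx(x, a, b):
    """Predict which of A or B a listener would judge more similar to X:
    'A' if X is closer to A in feature space, otherwise 'B'."""
    return 'A' if dtw_distance(x, a) <= dtw_distance(x, b) else 'B'

# Random "feature sequences" standing in for features extracted
# from real recordings of two speech sound categories.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 13))                 # token of category A
B = rng.normal(loc=2.0, size=(35, 13))        # token of category B
X = A + rng.normal(scale=0.1, size=A.shape)   # another token of A
print(predict_abx(X, A, B))                   # expected: 'A'
```

Different feature types can be swapped into this comparison, and the predicted responses can then be checked against human listeners' responses on the same trials.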

By allowing us to compare how humans and speech recognition systems use information when perceiving speech, this research will provide a tool that can help make speech recognition systems more human-like. Reverse engineering human perception can improve the way these systems generalize to new dialects, talkers, and noise conditions. This has the potential to facilitate the construction of systems for low-resource languages, broadening the impact of speech recognition technologies.

[Supported by SBE/BCS/PAC and CISE/IIS/RI]

Project Start:
Project End:
Budget Start: 2013-09-01
Budget End: 2016-08-31
Support Year:
Fiscal Year: 2013
Total Cost: $179,724
Indirect Cost:
Name: University of Maryland College Park
Department:
Type:
DUNS #:
City: College Park
State: MD
Country: United States
Zip Code: 20742