Hearing loss is one of the most prevalent chronic conditions, affecting 10% of the U.S. population. Although signal amplification by modern hearing aids makes sound more audible to hearing-impaired listeners, speech understanding in background noise remains one of the biggest challenges for hearing prostheses. The proposed research seeks a solution to this challenge by developing a speech segregation system that can significantly improve the intelligibility of noisy speech for listeners with hearing loss, with the longer-term goal of application to hearing aid design. Unlike traditional speech enhancement and beamforming algorithms, the proposed monaural (one-microphone) solution will be grounded in perceptual principles of auditory scene analysis. Auditory scene analysis comprises two stages: a simultaneous organization stage that groups concurrent sound components and a sequential organization stage that groups sound components across time. This project is designed to achieve three specific aims.
The first aim is to improve word recognition scores of hearing-impaired listeners in background noise. The second and third aims are to improve sentence-level intelligibility scores in background noise and in interfering speech, respectively. To achieve the first aim, a simultaneous organization algorithm will be developed that uses the pitch cue to segregate voiced speech and onset and offset cues to segregate unvoiced speech. To achieve aims 2 and 3, a sequential organization algorithm will be developed that groups simultaneously organized streams across time to produce a sentence segregated from background interference. Sequential organization will be performed by analyzing pitch characteristics and applying a novel clustering method based on incremental speaker modeling. A set of seven speech intelligibility experiments involving both hearing-impaired and normal-hearing listeners will be conducted to systematically evaluate the developed system.
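To make the two-stage data flow concrete, the following is a minimal Python sketch, not the proposed system: `simultaneous_organization` keeps time-frequency units whose frequency lies near a harmonic of a per-frame pitch estimate (the pitch cue for voiced speech; the onset/offset analysis for unvoiced speech is omitted), and `sequential_organization` substitutes a plain two-way k-means over per-segment features for the proposed incremental speaker-modeling clustering. All function names, parameters, and the k-means substitution are illustrative assumptions, not the grant's method.

```python
import numpy as np

def simultaneous_organization(spectrogram, pitch_track, freqs, tol=0.04):
    # Group time-frequency units whose center frequency lies near a harmonic
    # of the frame's pitch estimate: a stand-in for pitch-cue grouping of
    # voiced speech. spectrogram: (n_freqs, n_frames); pitch_track: per-frame
    # F0 in Hz (<= 0 for unvoiced frames); freqs: bin centers in Hz.
    mask = np.zeros(spectrogram.shape, dtype=bool)
    for t, f0 in enumerate(pitch_track):
        if f0 <= 0:                      # unvoiced frame: no pitch grouping
            continue
        harmonic = np.round(freqs / f0)  # nearest harmonic number per bin
        near = np.abs(freqs - harmonic * f0) < tol * f0
        mask[:, t] = near & (harmonic >= 1)
    return mask

def sequential_organization(segment_features, n_iter=20, seed=0):
    # Two-way k-means over per-segment feature vectors (e.g. mean cepstra):
    # a crude placeholder for the incremental speaker-modeling clustering
    # that assigns simultaneously organized streams to target vs. interferer.
    rng = np.random.default_rng(seed)
    X = np.asarray(segment_features, dtype=float)
    centers = X[rng.choice(len(X), size=2, replace=False)]
    for _ in range(n_iter):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for k in range(2):
            if np.any(labels == k):
                centers[k] = X[labels == k].mean(axis=0)
    return labels                        # 0/1 stream assignment per segment
```

In the proposed system, sequential grouping would be driven by pitch continuity and incrementally trained speaker models rather than a fixed two-way clustering; the sketch only illustrates how simultaneous grouping produces masked streams that a second stage then assigns across time.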

Public Health Relevance

A widely acknowledged deficit of hearing loss is reduced intelligibility of noisy speech. Improving speech intelligibility for hearing-impaired listeners in noisy environments is a major challenge. This project will directly address this challenge, and the results are expected to yield technical solutions for better hearing aid design, potentially benefiting millions of individuals who suffer from hearing loss.

Agency
National Institutes of Health (NIH)
Institute
National Institute on Deafness and Other Communication Disorders (NIDCD)
Type
Research Project (R01)
Project #
1R01DC012048-01A1
Application #
8436599
Study Section
Auditory System Study Section (AUD)
Program Officer
Miller, Roger
Project Start
2013-01-01
Project End
2017-12-31
Budget Start
2013-01-01
Budget End
2013-12-31
Support Year
1
Fiscal Year
2013
Total Cost
$346,930
Indirect Cost
$96,930
Name
Ohio State University
Department
Biostatistics & Other Math Sci
Type
Schools of Engineering
DUNS #
832127323
City
Columbus
State
OH
Country
United States
Zip Code
43210
Williamson, Donald S; Wang, Yuxuan; Wang, DeLiang (2016) Complex Ratio Masking for Monaural Speech Separation. IEEE/ACM Trans Audio Speech Lang Process 24:483-492
Chen, Jitong; Wang, Yuxuan; Wang, DeLiang (2016) Noise Perturbation for Supervised Speech Separation. Speech Commun 78:1-10
Chen, Jitong; Wang, Yuxuan; Yoho, Sarah E et al. (2016) Large-scale training to increase speech intelligibility for hearing-impaired listeners in novel noises. J Acoust Soc Am 139:2604
Zhang, Xiao-Lei; Wang, DeLiang (2016) A Deep Ensemble Learning Method for Monaural Speech Separation. IEEE/ACM Trans Audio Speech Lang Process 24:967-977
Healy, Eric W; Yoho, Sarah E; Chen, Jitong et al. (2015) An algorithm to increase speech intelligibility for hearing-impaired listeners in novel segments of the same noise type. J Acoust Soc Am 138:1660-9
Williamson, Donald S; Wang, Yuxuan; Wang, DeLiang (2015) Estimating nonnegative matrix model activations with deep neural networks to increase perceptual speech quality. J Acoust Soc Am 138:1399-407
Narayanan, Arun; Wang, DeLiang (2015) Improving Robustness of Deep Neural Network Acoustic Models via Speech Separation and Joint Adaptive Training. IEEE/ACM Trans Audio Speech Lang Process 23:92-101
Williamson, Donald S; Wang, Yuxuan; Wang, DeLiang (2014) Reconstruction techniques for improving the perceptual quality of binary masked speech. J Acoust Soc Am 136:892-902
Healy, Eric W; Yoho, Sarah E; Wang, Yuxuan et al. (2014) Speech-cue transmission by an algorithm to increase consonant recognition in noise for hearing-impaired listeners. J Acoust Soc Am 136:3325
Wang, Yuxuan; Narayanan, Arun; Wang, DeLiang (2014) On Training Targets for Supervised Speech Separation. IEEE/ACM Trans Audio Speech Lang Process 22:1849-1858

Showing the most recent 10 out of 12 publications