Auditory Scene Analysis with Complex Sounds

McDermott, Josh

Abstract

Perhaps the most pervasive problem faced by listeners with hearing impairment or cochlear implants is the difficulty of recognizing speech and other sounds in the presence of competing sound sources, as when conversing at a restaurant. This difficulty in "sound segregation" (hearing a particular sound of interest when it is embedded in a mixture of other sounds) often leads to frustration and social isolation, and is not adequately addressed by current hearing aids and implants. Sound segregation difficulties are also commonly reported in developmental auditory disorders. The long-term goal of the proposed research is to reveal the basis of sound segregation and to provide insights that will facilitate improved prosthetic devices and remediation strategies, as well as more effective machine systems for processing sounds, e.g. for automatic speech recognition. The development of more effective devices, technologies, and therapies is currently limited by an incomplete understanding of the factors that underlie sound segregation by normal-hearing listeners. In particular, little is known about sound segregation with complex naturalistic sounds, in part because much of the research in this area has been conducted using simple signals that are impoverished relative to the sounds listeners normally encounter. We propose to enrich the understanding of sound segregation with three sets of experiments that use novel sound synthesis methods to manipulate properties of natural speech and other sounds and test their role in segregation with behavioral experiments in human listeners.
Aim 1 manipulates the classically proposed grouping cue provided by harmonic frequency relations and investigates the mechanisms underlying their effect.
Aim 2 investigates the role of prior knowledge of voice and speech structure on segregation, and should help to explain why some voices are easier or harder to segregate than others.
Aim 3 investigates the role of attentive tracking in the segregation of sounds from mixtures, and will explore the factors that facilitate tracking or cause it to fail.
The results will reveal the mechanisms underlying sound segregation by the healthy auditory system, and will provide insights into the factors that limit auditory comprehension in the presence of multiple sound sources, hopefully suggesting new strategies for signal enhancement, prosthetic devices, and behavioral remediation.

Public Health Relevance

People with normal hearing are usually able to recognize and understand sounds of interest even when they are embedded in mixtures of other sounds, but this ability is often severely compromised in listeners with impaired hearing. The proposed research will enrich the understanding of the mechanisms underlying this ability, and its limitations. The results will likely provide insight into the design of more effective signal enhancement and source separation algorithms for use in prosthetic devices such as hearing aids and cochlear implants.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute on Deafness and Other Communication Disorders (NIDCD)
Type: Research Project (R01)
Project #: 5R01DC014739-02
Application #: 9339650
Study Section: Auditory System Study Section (AUD)
Program Officer: Miller, Roger

Project Start: 2016-09-01
Project End: 2021-08-31
Budget Start: 2017-09-01
Budget End: 2018-08-31
Support Year: 2
Fiscal Year: 2017

Institution

Name: Massachusetts Institute of Technology
Department: Other Basic Sciences
Type: Schools of Arts and Sciences
DUNS #: 001425594

City: Cambridge
State: MA
Country: United States
Zip Code: 02142

Related projects


NIH 2020 R01 DC	Auditory Scene Analysis with Complex Sounds McDermott, Josh H. / Massachusetts Institute of Technology
NIH 2019 R01 DC	Auditory Scene Analysis with Complex Sounds McDermott, Josh H. / Massachusetts Institute of Technology
NIH 2018 R01 DC	Auditory Scene Analysis with Complex Sounds McDermott, Josh H. / Massachusetts Institute of Technology
NIH 2017 R01 DC	Auditory Scene Analysis with Complex Sounds McDermott, Josh H. / Massachusetts Institute of Technology
NIH 2016 R01 DC	Auditory Scene Analysis with Complex Sounds McDermott, Josh H. / Massachusetts Institute of Technology	$376,358

Publications

Woods, Kevin J P; McDermott, Josh H (2018) Schema learning for the cocktail party problem. Proc Natl Acad Sci U S A 115:E3313-E3322

McPherson, Malinda J; McDermott, Josh H (2018) Diversity in pitch perception revealed by task dependence. Nat Hum Behav 2:52-66

Popham, Sara; Boebinger, Dana; Ellis, Dan P W et al. (2018) Inharmonic speech reveals the role of harmonicity in the cocktail party problem. Nat Commun 9:2122

McWalter, Richard; McDermott, Josh H (2018) Adaptive and Selective Time Averaging of Auditory Scenes. Curr Biol 28:1405-1418.e10

Woods, Kevin J P; Siegel, Max H; Traer, James et al. (2017) Headphone screening to facilitate web-based auditory experiments. Atten Percept Psychophys 79:2064-2072
