Separating signals that have been mixed together is an archetypal engineering problem. The past decade has seen the emergence of several approaches applicable to separating sound mixtures -- for example, a restaurant scenario in which a desired target voice must be extracted from the background babble of other patrons. However, the most appropriate goal, and hence the way to measure performance, is not always clear. In this project, the goal is established as improving intelligibility, i.e. processing sound mixtures so that a human listener can better understand what is being said. This requires a collaboration between computer scientists/electrical engineers -- who provide the separation algorithms -- and auditory scientists/psychologists -- who guide the results towards perceptually-relevant improvements and evaluate them in listener tests.
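To make the measurement question concrete, the sketch below computes the scale-invariant signal-to-noise ratio (SI-SNR), one common objective measure of separation quality. It is offered purely as an illustration, not as this project's chosen metric, and assumes only NumPy; indeed, part of the premise here is that such waveform-level scores need not track what a listener actually understands.

\begin{verbatim}
import numpy as np

def si_snr(estimate, target, eps=1e-8):
    # Scale-invariant signal-to-noise ratio in dB.
    # Remove DC offsets so the score depends only on the waveforms.
    estimate = estimate - estimate.mean()
    target = target - target.mean()
    # Project the estimate onto the target: the aligned part counts
    # as "signal", the residual as "error".
    s_target = (np.dot(estimate, target) / np.dot(target, target)) * target
    e_noise = estimate - s_target
    return 10.0 * np.log10(np.dot(s_target, s_target)
                           / (np.dot(e_noise, e_noise) + eps))

# Example: a 440 Hz tone corrupted by mild white noise.
t = np.linspace(0, 1, 8000)
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + 0.1 * np.random.default_rng(0).standard_normal(t.size)
print(si_snr(noisy, clean))  # about 17 dB at this noise level
\end{verbatim}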
The particular techniques to be developed and combined include blind source separation (such as independent component analysis), computational auditory scene analysis (simulations of what is understood about human perceptual processing), and model-driven approaches derived from the machine-learning techniques of speech recognition. One specific area of interest is the synthesis of `minimally-informative noise': acoustic tokens that effectively communicate both what can be inferred and what remains unknown about the target signal, thereby leveraging the powerful perceptual inference of human listeners.
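As a flavour of the first of these techniques, the following minimal sketch performs blind source separation with independent component analysis, using scikit-learn's FastICA as one standard off-the-shelf implementation (the algorithms developed in this project may well differ). Two synthetic sources are mixed by an arbitrary matrix, standing in for two microphones that each pick up both sources, and ICA recovers them up to permutation and scaling.

\begin{verbatim}
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 1, 8000)
# Two synthetic "sources": a 440 Hz tone and a 7 Hz sawtooth.
s1 = np.sin(2 * np.pi * 440 * t)
s2 = 2 * ((7 * t) % 1) - 1
S = np.c_[s1, s2]

# Mix them with an arbitrary (in practice, unknown) mixing matrix,
# simulating two microphones that each pick up both sources.
A = np.array([[1.0, 0.6],
              [0.4, 1.0]])
X = S @ A.T

# ICA recovers statistically independent components; the outputs
# match the sources only up to permutation and scaling.
ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)
\end{verbatim}

The sketch also illustrates the classical limitations of this approach: ICA requires at least as many sensors as sources and assumes the sources are statistically independent, which is part of the motivation for combining it with computational auditory scene analysis and model-driven approaches.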
This project will lead to implementations of acoustic signal separation that deliver the greatest benefit to human listeners, potentially including both normal-hearing and hearing-impaired individuals. Applications range from processing archival recordings to improved real-time communications technologies, and the techniques also have the potential to benefit automatic speech recognition systems.