This Small Business Innovation Research (SBIR) Phase II project addresses the problem of everyday noisy environments that limit when and where people can be heard clearly over communication devices (phones, first-responder radios, voice-over-IP, etc.). Because single-microphone noise-reduction techniques cannot handle non-stationary noise (e.g., restaurant noise), the industry's current solution is dual-microphone techniques. However, dual-microphone techniques cost more, require additional hardware, need spatial separation between the noise and the speech, and can process the signal only on the transmit side. The proposed technology is a single-microphone, software-only solution that effectively handles non-stationary noise arriving from any direction and can process the signal on both the transmit and receive sides. The novelty of the approach is its use of speech-specific characteristics and knowledge of human perception to extract speech from the noisy signal. The objective of the proposed research is to improve the current method of detecting voice activity, enabling better speech extraction and thereby enhancing speech quality. The resulting technology will be architecture-agnostic, cost-effective, and superior in performance in everyday situations.

The broader impact/commercial potential of this project is considerable and compelling. The initial focus will be the mobile phone industry, now the single largest user of noise-suppression products, with a potential market that spans nearly all of humanity. The technology will then be optimized to improve the listening experience for hundreds of millions of potential hearing-aid and cochlear-implant users worldwide. According to the World Health Organization, 278 million people worldwide have hearing loss. This number is expected to at least double over the next 30 years, driven by growth in the number of senior citizens over 65 and in the number of younger people who, because of loud-music listening habits, need hearing aids 20 years sooner than their parents did. Additional markets include first-responder radios, voice-over-IP, and military/intelligence/homeland-security applications. The research required to reduce computational complexity will also illuminate the essential aspects of auditory scene analysis needed for improved speech perception.

Project Start:
Project End:
Budget Start: 2012-08-15
Budget End: 2016-07-31
Support Year:
Fiscal Year: 2012
Total Cost: $1,096,090
Indirect Cost:
Name: Omnispeech
Department:
Type:
DUNS #:
City: College Park
State: MD
Country: United States
Zip Code: 20742