Humans and other animals can discriminate and recognize sounds despite substantial acoustic variability in real-world sounds. This ability depends partly on the auditory system's capacity to detect and use high-order statistical regularities present in the acoustic environment. Despite numerous advances in signal processing, assistive listening devices and speech recognition technologies lack biologically realistic strategies for dealing dynamically with such acoustic variability. Thus, a comprehensive theory of how the central nervous system encodes and uses statistical structure in sounds is essential for developing processing strategies for sound recognition, coding, and compression, and for assisting individuals with hearing loss. This proposal presents a novel approach to the question of how the auditory system handles and exploits statistical regularities for the identification and discrimination of sounds in two critical mammalian auditory structures (inferior colliculus, IC; auditory cortex, AC).
Aim 1 is to develop a catalogue of natural and man-made sounds and their associated high-order statistics. Cluster analysis and machine learning will be applied to the sound ensembles to identify salient statistical features that can be used to identify and categorize sounds from a computational perspective.
Using information-theoretic and correlation-based methods, Aim 2 tests the hypothesis that statistical sound regularities are encoded in neural response statistics, including firing rate and spike-timing statistics of IC and AC neurons.
Aim 3 will determine neurometric response functions and address the hypothesis that high-order statistical regularities in sounds can be discriminated based on the temporal pattern and firing rate statistics of single neurons in IC and AC.
Aim 4 will employ multi-site recording electrode arrays to test the hypothesis that neural populations in IC and AC use high-order statistics for sound discrimination and that statistical regularities are encoded by regionally distributed differences in the strength and timing of neural responses or in neuron-to-neuron correlations.
The study will provide the groundwork for developing a general theory of how the brain encodes and discriminates sounds based on high-order statistical features. A catalogue of neural responses from single cells and neural ensembles, together with the high-level statistical features that differentiate real-world sounds, will be developed and deployed as an online resource. The role that high-order statistics play in sound recognition and discrimination will be identified from both a computational and a neural coding perspective, including the identification of transformations across neural structures and across spatial and temporal scales.
The project will foster collaborations between the psychology, electrical engineering, and biomedical engineering departments at UConn. Graduate students, undergraduate students, and a postdoctoral fellow, including women and minorities, will participate in the research and will receive interdisciplinary training in neurophysiology, computational neuroscience, and engineering. Drs. Read and Escabi regularly host summer interns in their labs and expect to host 1-2 undergraduate students per year. Graduate students will be enrolled in biomedical engineering, electrical engineering, and psychology programs. Project findings will be integrated into graduate computational neuroscience and biomedical engineering coursework. The findings could lead to a host of new sound recognition technologies that use high-order statistical regularities to recognize and differentiate among sounds.
Understanding how high-order statistics are represented in the brain could guide the development of optimal algorithms for detecting a target sound (e.g., speech) in variable or noisy conditions. Such sound recognition systems are also applicable in industrial settings: for instance, identifying faulty machine systems from machine-generated sounds. Knowledge of the statistical distributions in real-world sounds and music will be useful for sound compression (e.g., MPEG coding) and for developing efficient sound processing algorithms. Finally, the findings can be incorporated into auditory prosthetics that mimic normal hearing physiology and use high-order sound statistics to remove background noise or enhance intelligibility.
Khatami, Fatemeh; Wöhr, Markus; Read, Heather L et al. (2018) Origins of scale invariance in vocalization sequences and speech. PLoS Comput Biol 14:e1005996