Recent studies have demonstrated the powerful abilities of deep architectures for statistical pattern recognition. Deep architectures transform their inputs through multiple layers of nonlinear processing. Their design is inspired by the connectivity of biological neural networks, and their hidden layers encode hierarchical, distributed representations of complex sensory input. Theoretical results suggest that such representations are needed to solve the most difficult problems of artificial intelligence.
Previous applications of deep architectures include visual object recognition, statistical language modeling, and nonlinear dimensionality reduction. Building on these successes, this project develops new applications of deep architectures for problems in speech and audio processing. Current front ends for these problems are dominated by traditional methods in statistical modeling and signal processing. Deep architectures have the potential to overcome many limitations of current approaches.
This project has two research components with interrelated and overlapping goals. The project's first component explores unsupervised learning in convolutional neural networks. The goal of learning in these networks is to discover new features for audio event detection and automatic speech recognition. The project's second component investigates the possibility of deep learning in kernel machines. This possibility is suggested by a recently discovered family of kernel functions that mimic the computation in large, multilayer networks.
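To make the idea of a kernel that mimics a multilayer network more concrete, the sketch below computes one such kernel. The abstract does not name the kernel family it has in mind; the arc-cosine kernels are one published family that fits the description, and they are used here purely as an assumption. The function names, the restriction to degrees 0 and 1, and the pure-Python implementation are all illustrative choices, not the project's software.

```python
import math


def _angular_term(theta, degree):
    """Angular dependence J_n(theta) of the arc-cosine kernel (degrees 0 and 1 only)."""
    if degree == 0:
        return math.pi - theta
    if degree == 1:
        return math.sin(theta) + (math.pi - theta) * math.cos(theta)
    raise ValueError("this sketch only handles degree 0 or 1")


def deep_arccos_kernel(x, y, degree=1, layers=1):
    """Arc-cosine kernel composed `layers` times (hypothetical helper).

    Each composition plays the role of one hidden layer in a very wide
    network. Inputs are plain sequences of floats and must be nonzero.
    """
    # Layer 0 is the ordinary inner product.
    kxy = sum(a * b for a, b in zip(x, y))
    kxx = sum(a * a for a in x)
    kyy = sum(b * b for b in y)
    if kxx == 0.0 or kyy == 0.0:
        raise ValueError("inputs must be nonzero vectors")

    for _ in range(layers):
        norm = math.sqrt(kxx * kyy)
        # Clamp to guard against rounding slightly outside [-1, 1].
        cos_theta = max(-1.0, min(1.0, kxy / norm))
        theta = math.acos(cos_theta)
        # Update the cross term and both self terms from the previous layer.
        kxy, kxx, kyy = (
            norm ** degree * _angular_term(theta, degree) / math.pi,
            kxx ** degree * _angular_term(0.0, degree) / math.pi,
            kyy ** degree * _angular_term(0.0, degree) / math.pi,
        )
    return kxy


if __name__ == "__main__":
    x, y = [1.0, 0.0], [0.0, 1.0]
    for depth in (1, 2, 3):
        print(depth, deep_arccos_kernel(x, y, degree=1, layers=depth))
```

Under these assumptions, increasing `layers` changes only how the kernel is evaluated, not the number of parameters to be trained, which is one reason such kernels are of interest for deep learning in kernel machines.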
The project's research components are tightly integrated with its educational activities. The project supports two graduate students, including one female student. An important goal is to develop publicly available software for use by other researchers.