Project Proposed: This project, acquiring an nVidia Tesla architecture system, facilitates the development of parallel algorithms for automatic speech recognition (ASR). The Tesla Hardware for Speech Recognition consists of a cluster of 10 nVidia Tesla rack-mounted units with associated infrastructure. New parallel codes must be developed to improve the accuracy of speech recognition, since computer systems now obey "Core's Law" (the number of cores on a chip doubles every two years). The system provides a large-scale, multi-core, general-purpose computing environment that enables the development of scalable parallel algorithms. The instrument provides a testbed in which to investigate future scalable algorithms, which are expected to facilitate continued leadership in commercial and industrial markets through advanced and novel algorithms. Thus, envisioning the state of computing in 5 to 10 years, when multi-core architectures prevail, the project aims first to
- Secure appropriate computing resources for research in speech recognition, and then
- Eclipse the current 2- and 4-core basic desktop systems.
The institute performs extensive training for machine learning and speech recognition in realistic settings with challenging acoustic properties and natural, human-to-human communication. Applications range from hands-free access for disabled users and natural speech-driven interfaces for the non-computer-literate to automatic meeting assistants and browsers, in which meetings are recorded in real time and tools allow access to content both during and after the meeting. The following two methods improve accuracy:
- Multi-stream methods that involve combinations at many levels within the system, including multiple features, multiple machine learning estimators, and multiple word-stream combinations, and
- Increasing the size of the training set.
Careful integration of data can improve accuracy even when the training data does not exactly match the conditions of the actual application. Both methods require increasing computational power that is hard to supply with current conventional hardware.
Broader Impacts: The acquisition helps the institute continue to attract young researchers and aids in their training. The improved computational capability facilitates the demonstration of speech research to local high school students. The BFOIT Foundation for Opportunities in Information Technology aims to attract more women and underrepresented minorities to computer science and engineering.