We explore a novel approach to modeling all human languages, based on the assumption that the sound characteristics of spoken languages can be covered by a universal set of acoustic units with no direct link to conventional phonetic definitions. The corresponding models, called acoustic segment models (ASMs), can be used to decode spoken utterances into strings of such units. The statistics of these units and their co-occurrences, computed over the utterances in a training set for a particular language, can be used to construct feature vectors for building vector-based language classifiers for automatic spoken language identification (LID). For spoken queries, ASM-derived feature vectors are extracted in the same manner and then used to discriminate among individual spoken languages. This collection of ASMs can be established from the bottom up in an unsupervised manner, and will serve as models of acoustic alphabets for constructing acoustic lexicons for speech recognition and language identification. In this project we study three fundamental issues related to UAC: (1) the acoustic coverage and resolution of the acoustic units needed to model spoken languages; (2) the complexity and discriminative power of UAC-derived features for spoken language identification; and (3) the relationship between language cues and UAC units for modeling spoken languages. This research facilitates a better understanding of how humans identify spoken languages through acoustic and linguistic cues, and provides mathematical modeling and computing techniques for building LID systems. We also intend to leverage research results from another NSF grant, on automatic speech attribute transcription (ASAT), to model salient speech cues for language characterization and their relevance to auditory perception. The entire collection of available language cues, including phones, syllables, words, prosody, and lexical cues, can also be incorporated into this synergistic approach to spoken language modeling and identification.
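
The pipeline described above (decode utterances into unit strings, count units and their co-occurrences, classify the resulting vectors) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the unit labels (`u1`, `u2`, ...), the language names, and the function names are hypothetical, the unit strings are hand-written rather than produced by real ASM decoding, and a simple cosine-similarity comparison against pooled per-language vectors stands in for the vector-based classifier, whose actual form the abstract does not specify.

```python
from collections import Counter
from math import sqrt


def ngram_counts(units, max_n=2):
    """Count single units and adjacent unit pairs (co-occurrences) in a sequence."""
    counts = Counter()
    for n in range(1, max_n + 1):
        for i in range(len(units) - n + 1):
            counts[tuple(units[i:i + n])] += 1
    return counts


def normalize(counts):
    """Scale a sparse count vector to unit Euclidean length."""
    norm = sqrt(sum(v * v for v in counts.values()))
    return {k: v / norm for k, v in counts.items()} if norm else {}


def cosine(a, b):
    """Dot product of two unit-length sparse vectors."""
    return sum(v * b.get(k, 0.0) for k, v in a.items())


def train(utterances_by_language):
    """Pool n-gram counts over each language's training utterances."""
    models = {}
    for lang, utterances in utterances_by_language.items():
        pooled = Counter()
        for units in utterances:
            pooled.update(ngram_counts(units))
        models[lang] = normalize(pooled)
    return models


def identify(models, units):
    """Return the language whose pooled vector is closest to the query vector."""
    query = normalize(ngram_counts(units))
    return max(models, key=lambda lang: cosine(query, models[lang]))


# Hypothetical decoded unit strings; a real system would obtain these
# by decoding audio with the acoustic segment models.
training = {
    "lang_A": [["u1", "u2", "u1", "u3"], ["u1", "u2", "u2", "u1"]],
    "lang_B": [["u4", "u5", "u4", "u4"], ["u5", "u4", "u3", "u5"]],
}
models = train(training)
print(identify(models, ["u2", "u1", "u2"]))  # prints "lang_A"
```

Pooling counts per language keeps the sketch short; a discriminatively trained classifier over per-utterance vectors would be a natural alternative under the same feature representation.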

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0639204
Program Officer: Tatiana D. Korelsky
Budget Start: 2006-08-15
Budget End: 2009-01-31
Fiscal Year: 2006
Total Cost: $199,557

Name: Georgia Tech Research Corporation
City: Atlanta
State: GA
Country: United States
Zip Code: 30332