ABSTRACT

The aim of the project is to develop an approach to a feature-based word recognition system which can cope with the wide range of variability that can occur in the acoustic manifestation of words. The variability in the acoustic pattern of words has many different sources, including intra- and inter-speaker differences in speaking style and rate, various contextual effects, and anatomic differences between speakers. Thus, whereas linguistic descriptions use binary-valued features, the acoustic correlates of the features have varying degrees of strength and extent over time.

The research will be divided into three stages. To ensure an adequate representation of the variability that can occur, a database of around 200 words will be designed. The test words will be excised from a variety of contexts and spoken by several talkers, both male and female. Next, an acoustic study of the corpus will be conducted to supplement data in the literature on the acoustic correlates for features and to further study variability. Finally, a speaker-independent isolated word recognition system will be developed and tested with a new database of the same 200 words.

The recognition system will consist of three components: property detectors for extracting the acoustic properties for features, a lexicon which will consist of some arrangement of features for each word, and a matcher for mapping between the extracted acoustic patterns and the lexical representation.

There will be several ways to handle variability. First, the acoustic properties for features will be based on relative measures so that they will be minimally dependent upon the speaker's characteristics. Second, unlike traditional feature-based recognition systems, lexical access will be achieved directly from the acoustic properties rather than from postulated phonetic segments. In this way, contextual information can be taken into account more directly during the matching process. Third, the investigator will develop a framework for expressing variability within the lexical representation of words. Although variability has several sources, it appears that the resulting acoustic changes can be expressed in terms of a few feature-altering processes, including phonetic lenition, feature assimilation, and some other structurally dependent variants. Thus, by incorporating the feature-altering processes in the lexical representation of words, it should not be necessary for the recognition system to have "seen" or have a rule for all of the variability that can occur in a particular word before it is able to recognize it.

The results of this project should contribute to our understanding of acoustic phonetics, and also to the development of improved technologies for automatic speech recognition.
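The three-component design described above (property detectors, a lexicon, and a matcher) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the project: the feature names, the two-word toy lexicon, the use of the utterance-wide peak as the "relative measure", and the simplification that the input is already divided into time-aligned regions, one per lexical feature bundle.

```python
from typing import Dict, List

# Hypothetical toy lexicon: each word is a sequence of binary feature bundles.
LEXICON: Dict[str, List[Dict[str, int]]] = {
    "ma": [{"nasal": 1, "voiced": 1}, {"nasal": 0, "voiced": 1}],
    "sa": [{"nasal": 0, "voiced": 0}, {"nasal": 0, "voiced": 1}],
}


def relative_strengths(raw: List[Dict[str, float]]) -> List[Dict[str, float]]:
    """Property detector stand-in: scale each raw measurement by the
    utterance-wide peak for that property, giving graded strengths in [0, 1]."""
    peaks: Dict[str, float] = {}
    for region in raw:
        for name, value in region.items():
            peaks[name] = max(peaks.get(name, 0.0), value)
    return [
        {name: (value / peaks[name] if peaks[name] > 0 else 0.0)
         for name, value in region.items()}
        for region in raw
    ]


def match_score(strengths: List[Dict[str, float]],
                entry: List[Dict[str, int]]) -> float:
    """Matcher stand-in: mean agreement between graded strengths and the
    binary lexical features. Assumes regions and bundles are already aligned."""
    if len(strengths) != len(entry):
        return 0.0
    total, count = 0.0, 0
    for observed, expected in zip(strengths, entry):
        for name, target in expected.items():
            s = observed.get(name, 0.0)
            total += s if target == 1 else 1.0 - s
            count += 1
    return total / count if count else 0.0


def recognize(raw: List[Dict[str, float]],
              lexicon: Dict[str, List[Dict[str, int]]]) -> str:
    """Return the lexicon word whose feature bundles best match the input."""
    strengths = relative_strengths(raw)
    return max(lexicon, key=lambda word: match_score(strengths, lexicon[word]))


# Two hypothetical talkers producing the same pattern at different overall levels.
loud = [{"nasal": 8.0, "voiced": 6.0}, {"nasal": 1.0, "voiced": 9.0}]
quiet = [{"nasal": 0.8, "voiced": 0.6}, {"nasal": 0.1, "voiced": 0.9}]
print(recognize(loud, LEXICON), recognize(quiet, LEXICON))
```

Because the strengths are computed relative to each utterance's own peak, the two inputs yield the same graded pattern despite the tenfold difference in level. A real system would also need time alignment and a representation of the feature-altering processes, neither of which this sketch attempts.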

Agency: National Science Foundation (NSF)
Institute: Division of Behavioral and Cognitive Sciences (BCS)
Type: Standard Grant (Standard)
Application #: 8920470
Program Officer: Paul G. Chapin
Project Start:
Project End:
Budget Start: 1990-09-15
Budget End: 1991-09-30
Support Year:
Fiscal Year: 1989
Total Cost: $11,800
Indirect Cost:
Name: Massachusetts Institute of Technology
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02139