9314959 Cole

This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between the Oregon Graduate Institute and the International Computer Science Institute. The proposed project considers research on: a) signal representation methodologies that would combat the environmental noise and non-linearities that detract from automatic speech recognition, b) speaker variability, which is also a source of uncertainty in recognition and is to be tackled from a probabilistic point of view using a collected (telephone) speech corpus, and c) dialogue enhancement methodologies and strategies, including recovery, in order to increase the probability of correct recognition. Each of these three areas is a major deterrent to robust recognition and will be investigated. The overall objective of the project is to consider integrated approaches to spoken-language system robustness.

January 5, 1994

Abstract
IRI-9314946 $259,339 - 12 mos.
Flanagan, James L.; Levinson, Steve E.; Lin, Qiguang and Davis, Donald
Rutgers University
Computational Models for Speech Generation
__________
This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between researchers at Rutgers University's Center for Computer Aids for Industrial Productivity (CAIP), AT&T Bell Laboratories and General Dynamics Electric Boat Division. The research to be conducted concerns the generation of speech signals in terms of (a) an articulatory description of the vocal system, and (b) a fluid dynamic solution to the generation, propagation, and radiation of audible sound produced by the acoustic system. This includes the computation of the speech signal from first principles, using the Navier-Stokes description of fluid flow, an approach already demonstrated to be feasible.
Anticipated results include a potentially significant improvement in the quality of synthesized speech and fundamentally new and more robust designs of speech recognizers stemming from a better understanding of the speech phenomenon and how it can be made more immune to interference. This research is also expected to influence improvements in the coding of speech at lower bit rates.

January 5, 1994

Abstract
IRI-9314967 $185,250 - 12 mos.
Price, P. J., E. E. Shriberg, H. Clark, and S. Shattuck-Hufnagel
SRI International
Modeling Disfluencies in Spontaneous Speech
__________
This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between researchers at SRI International, Stanford University and the Massachusetts Institute of Technology. The proposed project considers research on the hypothesis that disfluencies in spontaneous speech - pauses, repeated words, repairs, filled pauses, word fragments, and elongated segments - are far from random and that knowledge about their regularity would shed light on aspects of human cognition and provide principled methods for dealing with them in spontaneous speech processing. Several relevant disciplines are involved in this effort, including human-computer interaction, linguistics, psycholinguistics, computational linguistics, prosody, and speech technology. The multi-disciplinary approach includes investigating the forms and distribution of disfluencies across many corpora, conducting perceptual experiments to assess the saliency of specific cues in the signal, and developing and evaluating new methods for automatic processing of speech.

Abstract
IRI-9314955 $123,610 - 12 mos.
Pustejovsky, J. and B.
Boguraev
Brandeis University
A Core Lexical Engine: The Contextual Determination of Word Sense
__________
This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between researchers at Brandeis University and at Apple Computer. The research involves building a core generative lexical engine that aids in determining word sense from lexical properties treated as composable in a given phrasal context. The project instantiates a lexical semantic segment of a substantial fragment of English by constructing such a core lexical engine with the following components: a semantic typing system, relational structures for all categories, and generative mechanisms enabling extension and identification of word sense in context. The project is complementary to other initiatives to develop linguistic infrastructure resources on a large scale, such as COMLEX and WordNet. The project develops mechanisms that carry out the specialized lexical inferences that result, through composition of lexical types, in word sense determination.

Abstract
IRI-9314969 $224,660 - 12 mos.
Sleator, D. and J. Lafferty
Carnegie Mellon University
Grammatical Trigrams: A New Approach to Statistical Language Modeling
__________
This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between researchers from Carnegie Mellon University and International Business Machines. This project takes advantage of the simplicity of the classical statistical trigram model of language while augmenting it with syntactic and semantic constraints that give the new grammatical trigram model an advantage over the purely stochastic model. The concepts of probabilistic link grammars are used in this research, incorporating trigrams into a unified framework for modeling long-distance grammatical dependencies in computationally efficient ways.
The methods proposed are expected to have greater predictive power than current methods, as measured by entropy, and to integrate finite-state automata models and new statistical estimation algorithms with powerful modern machines, resulting in improved speech recognition, translation, and understanding systems.

Abstract
IRI-9314961 $274,761 - 12 mos.
Thomason, R. H., and J. R. Hobbs
University of Pittsburgh
Integrated Techniques for Generation and Interpretation
__________
This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between the University of Pittsburgh and the Stanford Research Institute. This project involves research on integrating the intentional and informational perspectives in architectures for interpretation and generation of interactive discourse. Particular problems investigated include the recognition by the listener of the speaker's plan, a formalization of the notion of conversational record, discourse structure from a computational point of view, and the analysis of implicatures involving quantity and of similar phenomena that involve interactions between the processes of generation and interpretation. Bases for the research are the use of abductive inference in finite-state approximation methods and in knowledge-based systems, accommodation processes in interactive discourse, generation of coherent text, use of defeasible reasoning in plan recognition, and utterance planning to achieve communicative goals.

Abstract
IRI-9314992 $250,000 - 12 mos.
Young, S. R. and J. G. Carbonell
Carnegie Mellon University
Multitext Fusion, Tracking and Trend Detection
__________
This is the first year of a continuing award in the NSF/ARPA Human Language Technology initiative involving cooperation between Carnegie Mellon University and the SRA Corporation.
This project involves the investigation of methodologies for the extraction of information from text and its summarization in structured data records for subsequent