Recent research indicates that approaches employing automated learning and training are the most promising methods for developing robust, efficient systems for processing natural language. The goal of this research project is to develop symbolic learning methods to aid the construction of natural-language systems. Learning algorithms are being developed for constructing systems requiring extraction of information from natural-language documents and subsequent natural-language querying of the resulting database. A method for learning to parse natural-language database questions into a formal query language has been developed and is being extended to automate the acquisition of the requisite lexicon of word meanings. Techniques are also being developed for automatically learning rules that locate relevant pieces of data in natural-language documents. Together, these methods can be used to automate the construction of systems requiring natural-language information extraction and querying. Methods for intelligently selecting informative training examples are also being developed in order to reduce the amount of annotated training data required. The resulting methods are being applied to automate the construction of systems that build a database from messages posted to an electronic newsgroup and then respond to natural-language queries, resulting in more natural and effective access to electronic information.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9704943
Program Officer: William Bainbridge
Project Start:
Project End:
Budget Start: 1997-08-01
Budget End: 2001-12-31
Support Year:
Fiscal Year: 1997
Total Cost: $339,563
Indirect Cost:
Name: University of Texas Austin
Department:
Type:
DUNS #:
City: Austin
State: TX
Country: United States
Zip Code: 78712