This project extends the depth of analysis of natural language understanding systems for open-domain newspaper texts by capturing important dependencies between clauses. Two co-dependent tasks are the focus of the study: (1) the determination of discourse structure and (2) the identification of referential links between anaphoric expressions and their antecedents.

This project utilizes the formal theory of discourse structure known as Segmented Discourse Representation Theory (SDRT). SDRT claims that tasks (1) and (2) strongly interact: discourse structure can rule out many possible anaphoric dependencies, while resolved anaphoric links can inform the determination of discourse relations and segments. This project tests this hypothesis in a limited setting for a restricted set of discourse relations.

Baldridge's Probabilistic Discourse Structure Parser (PDiSP) is the principal tool for addressing task (1), while a Maximum Entropy-based Anaphor Resolver (AR) handles task (2). The project involves annotation for discourse structure and anaphoric dependencies to help guide and train the tools. Features relevant to both tools, such as predicate-argument structure, tense, and aspect, are drawn from wide-coverage sentential parsers. Lexical information is drawn from WordNet "sanitized" by OntoClean. The first phase of the project investigates the interdependency of tasks (1) and (2) by using anaphora-based features in the PDiSP and then features based on discourse structure accessibility in the AR. The results are then compared to baseline models and to versions of the AR and PDiSP run without those features. Both software and annotated materials developed in the project will be made available to the community.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0535154
Program Officer: Tatiana D. Korelsky
Project Start:
Project End:
Budget Start: 2006-01-15
Budget End: 2008-08-31
Support Year:
Fiscal Year: 2005
Total Cost: $244,127
Indirect Cost:
Name: University of Texas Austin
Department:
Type:
DUNS #:
City: Austin
State: TX
Country: United States
Zip Code: 78712