9753054 Wacholder, Nina Columbia University POWRE: Computationally Tractable Methods for Document Analysis

Although impressive progress has been made in recent years on information technology systems, their output is still far from perfect. For example, information retrieval systems, which locate information relevant to a particular query, frequently miss highly relevant articles. Much of the progress in information technology is due to the effectiveness of techniques based on statistical analysis of large corpora of text; however, recent work suggests that these results can be improved with hybrid systems that skillfully combine linguistic and statistical techniques. The central hypothesis of this research is that systematic analysis of documents will provide valuable insights into the linguistic properties that affect their computational tractability. The goal is to identify linguistically motivated heuristics that indicate complexity of structure, vocabulary, or concept; these heuristics will shed light on how well different systems are able to handle different kinds of documents. Most importantly, improved understanding of the nature of the data that information technology must handle will contribute to the design of the new generation of hybrid systems. Conducting this research will help the PI, a cognitive linguist and librarian, to establish a career in a hybrid area that combines aspects of information science, linguistics, and computer science.

Project Start:
Project End:
Budget Start: 1998-01-01
Budget End: 2000-06-30
Support Year:
Fiscal Year: 1997
Total Cost: $61,373
Indirect Cost:
Name: Columbia University
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10027