RIA: A Testbed for the Application of Corpus Linguistics to Information Retrieval

Gauch, Susan

Abstract

Rapidly increasing storage media capabilities and spreading interconnectivity have heralded the arrival of the information age. Unfortunately, accessing online information remains an inexact science. While valuable information can be found, many irrelevant documents are typically retrieved as well, and many relevant ones are missed. Terminology mismatches between the user's query and document contents are one cause of retrieval failures. Expanding a user's query with related words can improve search performance, but the problem of identifying related words remains. This research uses corpus linguistics techniques to automatically discover word similarities directly from the contents of an untagged textual database and to incorporate that information into an information retrieval system. These similarities are calculated based on the contexts in which the words appear. Using these similarities, user queries are automatically expanded, resulting in conceptual retrieval rather than requiring exact word matches between queries and documents. The effects of using different algorithms to calculate the similarities and the effects of expanding different sets of query words are evaluated. In addition, the search performance of the retrieval engine serves as a task-based method for comparing the quality of word-word similarities calculated using different corpus linguistics techniques.
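The general idea of context-based similarity and query expansion can be illustrated with a minimal Python sketch. The abstract does not specify which similarity algorithms the project uses, so everything here is an assumption made for illustration: the fixed co-occurrence window, the cosine measure, and the hypothetical names `context_vectors`, `cosine`, and `expand_query`.

```python
from collections import Counter, defaultdict
import math


def context_vectors(sentences, window=2):
    """Map each word to counts of the words seen within `window` positions of it.

    Assumption: a symmetric fixed-size window over untagged, pre-tokenized text.
    """
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def expand_query(query_terms, vectors, top_k=2, min_sim=0.5):
    """Append up to `top_k` of the most context-similar words per query term.

    Assumption: `top_k` and `min_sim` are illustrative cut-offs, not values
    taken from the project.
    """
    expanded = list(query_terms)
    for term in query_terms:
        if term not in vectors:
            continue  # word never seen in the corpus; nothing to expand with
        scored = [
            (cosine(vectors[term], vec), other)
            for other, vec in vectors.items()
            if other != term
        ]
        # Highest similarity first; ties broken alphabetically for determinism.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        for sim, other in scored[:top_k]:
            if sim >= min_sim and other not in expanded:
                expanded.append(other)
    return expanded
```

In this sketch, two words that never co-occur can still score as similar if they appear in the same surrounding contexts, which is what lets an expanded query match documents that use different terminology from the user.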

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9409263
Program Officer: C. Suzanne Iacono

Project Start
Project End
Budget Start: 1994-08-15
Budget End: 1998-08-31
Support Year
Fiscal Year: 1994
Total Cost: $104,925
Indirect Cost

Institution

University of Kansas, Lawrence, KS, United States
