9221276 Cottrell
Learning Semantic Representations for Information Retrieval

This is the first year of funding for a three-year continuing award. The objective of this project is to develop methods that automatically represent text-based documents from a large collection in a way that facilitates semantically precise retrieval. A critical problem in representing documents is that the words in a document are not accurate descriptors of its content. This is due in part to synonymy in natural language: a single concept can be described in many different ways. Most current approaches fail to account for this, because they determine semantic relevance from the co-occurrence of words in documents. The proposed approach is to index documents so that they are representationally similar when they are semantically related, not just when they coincidentally share terms. Multidimensional Scaling (MDS) and neural network theory are the foundations of the work. The approach is shown to be similar to the best current technique for statistical semantic analysis of documents, Latent Semantic Indexing (LSI). The work suggests a generalization of LSI, which is a linear and metric technique, to non-linear and non-metric techniques. This work is expected to provide a well-founded theoretical framework for document indexing based on MDS, to advance the use of neural network techniques in document indexing, and to help in the quantitative evaluation of current document retrieval methods.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9221276
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 1993-08-01
Budget End: 1997-07-31
Support Year:
Fiscal Year: 1992
Total Cost: $215,000
Indirect Cost:
Name: University of California San Diego
Department:
Type:
DUNS #:
City: La Jolla
State: CA
Country: United States
Zip Code: 92093