Many searchable text information sources are available on the Internet. The documents a user wants are often stored in multiple heterogeneous sources, which makes them difficult to find. Automatic search brokers (metasearch engines) that can invoke multiple search engines are needed so that a user can retrieve desired documents from multiple sources with a single query. This collaborative project will develop techniques needed for the next generation of metasearch engines. Highly scalable methods that accurately estimate the usefulness of an information source with respect to each user query are to be developed. These methods would enable the metasearch engine to search only a small number of sufficiently useful information sources for each query, so that resources are not wasted on searching useless sources. Linkages among documents are to be used to improve the retrieval of relevant documents. A concept hierarchy is to be used to disambiguate the meanings of terms. Techniques that guarantee the retrieval of all potentially useful documents from a local search engine, while minimizing the retrieval of useless documents, are to be investigated. Good solutions to the problem of merging results from multiple sources into a single ranked list are also sought. The techniques developed will be shown to improve significantly on existing ones in both efficiency and effectiveness. A metasearch engine that incorporates the developed techniques on top of major university search engines will be implemented and made publicly accessible.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9902872
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 1999-10-01
Budget End: 2002-09-30
Support Year:
Fiscal Year: 1999
Total Cost: $133,501
Indirect Cost:
Name: Suny at Binghamton
Department:
Type:
DUNS #:
City: Binghamton
State: NY
Country: United States
Zip Code: 13902