The goal of this project is to develop the theoretical and implementation foundations for VGRAM, a technique that uses variable-length, high-quality grams from a collection of strings to support approximate queries on the collection. The research plan includes four tasks: 1) developing methods for VGRAM to select an optimal set of grams automatically, without requiring user-defined parameters, 2) integrating VGRAM into relational database management systems to ease adoption, 3) using VGRAM to support approximate keyword search in documents, and 4) evaluating VGRAM using two real applications, one for integrating Web information about family reunification and one for integrating medical information.

The research results are expected to have a significant impact on society, since approximate string queries are needed in many applications, such as data integration and record linkage. This project supports two PhD students pursuing research in the areas of text retrieval and database systems. Publications, technical reports, software, and experimental data from this project will be disseminated via the project web site (http://flamingo.ics.uci.edu/).

Project Start:
Project End:
Budget Start: 2007-09-01
Budget End: 2009-02-28
Support Year:
Fiscal Year: 2007
Total Cost: $99,507
Indirect Cost:
Name: University of California Irvine
Department:
Type:
DUNS #:
City: Irvine
State: CA
Country: United States
Zip Code: 92697