We propose to develop efficient search and retrieval techniques for very large text and image repositories. We will develop algorithms for exact and approximate pattern matching on text and images in the compressed domain, where the compression is based on the family of algorithms that depend on sorted contexts. We will also investigate compressed pattern search for context-based predictive image compression schemes, although the contexts used by these algorithms are not sorted contexts. We will develop search-aware compression schemes that support search directly on the compressed data with minimal or no decompression. We will also develop software tools and create an integrated global compression/search utility website for lossless compression of text and images and for efficient search of large collections directly in their compressed form. We will develop compressed-domain search engines that use compressed keywords or inversion dictionaries to expedite search operations on terabyte-scale text and image repositories. The broader impact of this research will be more efficient use of storage, computation, and communication resources. Graduate students will be trained in research, and state-of-the-art knowledge in data compression and information retrieval technology will be transferred to the classroom.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Type
Standard Grant (Standard)
Application #
0312724
Program Officer
Maria Zemankova
Project Start
Project End
Budget Start
2003-09-15
Budget End
2007-08-31
Support Year
Fiscal Year
2003
Total Cost
$246,000
Indirect Cost
Name
University of Central Florida
Department
Type
DUNS #
City
Orlando
State
FL
Country
United States
Zip Code
32816