Many applications, including geographical information systems, scientific visualization and high-dimensional databases, rely on large data repositories up to terabytes in size. Although terabytes of storage space are now achievable, efficient retrieval remains a challenging problem. This research develops novel schemes to distribute data among parallel disks for efficient retrieval, including schemes using replication, efficient retrieval algorithms and self-organizing schemes that adapt to disk failures, disk additions and changing query patterns.

Declustering has attracted a lot of interest over the last few years and has applications in many areas, including high-dimensional data management, geographical information systems and scientific visualization. Most declustering research has focused on spatial range queries and on finding schemes with low worst-case additive error. This research investigates various aspects of declustering, including novel declustering schemes, replicated declustering, heterogeneous declustering, adaptive declustering and declustering using multiple databases. The investigators approach every issue both theoretically and practically: they study what is theoretically possible and what can be achieved in practice, and try to close the gap between the two. The investigators study novel declustering schemes with solid theoretical foundations, including number-theoretic declustering and design-theoretic declustering. Replication strategies are studied for various types of queries, including spatial range queries and arbitrary queries. The retrieval algorithm for design-theoretic replication has linear complexity and guarantees the worst-case retrieval cost. The investigators study tradeoffs between the complexity of a retrieval algorithm and its retrieval cost, and develop a suite of protocols for retrieval. This research also involves adaptive declustering schemes that respond to disk failures, disk additions and changing query types by moving buckets between disks during idle periods.
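
To make the notion of additive error concrete, the following is a minimal sketch that measures it for spatial range queries on a two-dimensional grid of buckets. It uses the classic disk modulo assignment, where bucket (i, j) goes to disk (i + j) mod N, purely as a stand-in; it is not one of the number-theoretic or design-theoretic schemes studied in this project. The function names, the half-open query representation and the assumption that every disk retrieves one bucket per time step are illustrative choices, not taken from the project.

```python
from collections import Counter
from math import ceil


def disk_modulo(i, j, num_disks):
    """Classic disk modulo assignment: bucket (i, j) is stored on disk (i + j) mod N."""
    return (i + j) % num_disks


def retrieval_cost(query, num_disks, scheme=disk_modulo):
    """Largest number of buckets that any single disk must serve for the query.

    query is ((row_start, row_end), (col_start, col_end)) with half-open ranges
    (an assumed representation). Disks work in parallel, so the busiest disk
    determines the retrieval time.
    """
    (r0, r1), (c0, c1) = query
    load = Counter(
        scheme(i, j, num_disks) for i in range(r0, r1) for j in range(c0, c1)
    )
    return max(load.values())


def additive_error(query, num_disks, scheme=disk_modulo):
    """Retrieval cost minus the optimal cost ceil(|Q| / N) for a query of |Q| buckets."""
    (r0, r1), (c0, c1) = query
    num_buckets = (r1 - r0) * (c1 - c0)
    return retrieval_cost(query, num_disks, scheme) - ceil(num_buckets / num_disks)


def worst_case_additive_error(grid_size, num_disks, scheme=disk_modulo):
    """Brute-force maximum additive error over every range query on a square grid."""
    worst = 0
    for r0 in range(grid_size):
        for r1 in range(r0 + 1, grid_size + 1):
            for c0 in range(grid_size):
                for c1 in range(c0 + 1, grid_size + 1):
                    query = ((r0, r1), (c0, c1))
                    worst = max(worst, additive_error(query, num_disks, scheme))
    return worst


if __name__ == "__main__":
    # A 2x2 query on 4 disks: buckets (0, 1) and (1, 0) land on the same disk.
    print(additive_error(((0, 2), (0, 2)), num_disks=4))
    print(worst_case_additive_error(grid_size=4, num_disks=4))
```

Any other assignment can be passed through the `scheme` parameter, so the same brute-force check can compare candidate schemes on small grids. The exhaustive search is only practical for small grid sizes.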

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0702728
Program Officer: Almadena Y. Chtchelkanova
Project Start:
Project End:
Budget Start: 2007-05-01
Budget End: 2011-04-30
Support Year:
Fiscal Year: 2007
Total Cost: $305,697
Indirect Cost:

Name: University of Texas at San Antonio
Department:
Type:
DUNS #:
City: San Antonio
State: TX
Country: United States
Zip Code: 78249