Enabled by dramatic improvements in compute power, network bandwidth, disk capacity, and digital sensing devices, the last decade has seen the emergence of massive scientific data sets archived in data repositories scattered across the country. Few sites have the capacity to copy an entire data set to local storage prior to analysis or visualization. Instead, this work explores distributed access strategies to enable remote applications to query and receive on-demand delivery of relevant fragments of large data sets. Integrated into a data grid that manages distributed data collections, these strategies monitor application access patterns and pre-fetch and cache needed portions of a data set, delivering to the remote application only the data that is needed, as it is needed. The approach is similar to virtual memory paging systems found in operating systems, but uses knowledge of the spatial and temporal structure of data sets and the application's observed access patterns to efficiently pre-fetch, format, cache, and deliver the data. The results of this work enable distributed access to data sets far larger than local storage resources would otherwise allow. Network bandwidth is used more efficiently by transferring only the data that is needed, and local storage is used only as a cache for frequently used portions of a data set. Expected results of this work include predictive pre-fetch algorithms, cost models, and a prototype implementation whose usefulness will be validated by its application to several real-world problems involving distributed access to enormous data sets.
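
The paging analogy in the abstract can be illustrated with a minimal sketch. It is not the project's algorithm: it assumes a data set split into fixed-size blocks addressed by integer index, a hypothetical `fetch_remote` callable that stands in for a remote read, and a simple stride detector that substitutes for the predictive pre-fetch algorithms the award proposes. All class, method, and parameter names are illustrative.

```python
from collections import OrderedDict
from typing import Callable, Optional


class PrefetchingBlockCache:
    """LRU cache of data-set blocks with a simple stride-based pre-fetch.

    Illustrative only: block indexing, the stride heuristic, and all names
    are assumptions, not details taken from the award abstract.
    """

    def __init__(self, fetch_remote: Callable[[int], bytes],
                 num_blocks: int, capacity: int) -> None:
        self.fetch_remote = fetch_remote  # hypothetical remote read, one block per call
        self.num_blocks = num_blocks      # total blocks in the remote data set
        self.capacity = capacity          # maximum number of blocks held locally
        self.blocks: "OrderedDict[int, bytes]" = OrderedDict()
        self.last_index: Optional[int] = None
        self.last_stride: Optional[int] = None
        self.hits = 0
        self.misses = 0

    def _store(self, index: int, data: bytes) -> None:
        # Insert as most recently used, then evict least recently used blocks.
        self.blocks[index] = data
        self.blocks.move_to_end(index)
        while len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)

    def read(self, index: int) -> bytes:
        if index in self.blocks:
            self.hits += 1
            self.blocks.move_to_end(index)
            data = self.blocks[index]
        else:
            self.misses += 1
            data = self.fetch_remote(index)
            self._store(index, data)

        # If the same non-zero stride is seen twice in a row, fetch the block
        # the application is predicted to ask for next. A real system would do
        # this asynchronously; here it is synchronous for simplicity.
        if self.last_index is not None:
            stride = index - self.last_index
            if stride != 0 and stride == self.last_stride:
                predicted = index + stride
                if 0 <= predicted < self.num_blocks and predicted not in self.blocks:
                    self._store(predicted, self.fetch_remote(predicted))
            self.last_stride = stride
        self.last_index = index
        return data
```

With a sequential scan such as `read(0)`, `read(1)`, `read(2)`, ..., the first three reads miss while the stride is being learned, and later reads are served from the local cache because the predicted block has already been fetched. Local storage never holds more than `capacity` blocks, which mirrors the abstract's point that it serves only as a cache.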

Dr. Brett D. Fleisch, Program Director, CISE/CNS, June 23, 2004.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Application #: 0410524
Program Officer: Brett D. Fleisch
Project Start:
Project End:
Budget Start: 2004-08-01
Budget End: 2006-07-31
Support Year:
Fiscal Year: 2004
Total Cost: $170,000
Indirect Cost:

Name: University of California San Diego
Department:
Type:
DUNS #:
City: La Jolla
State: CA
Country: United States
Zip Code: 92093