Existing data storage systems built on a hierarchical directory-tree organization do not meet the scalability and functionality requirements of exponentially growing datasets and increasingly complex metadata queries in large-scale, exabyte-level file systems with billions of files. This project focuses on a new decentralized, semantic-aware metadata organization that exploits the semantics of file metadata to improve system scalability, reduce latency for complex data queries, and enhance file system functionality.

The research has four major components: 1) exploit semantic correlation among metadata to organize metadata in a scalable way, 2) exploit the semantic and scalable nature of the new metadata organization to significantly speed up complex queries and improve file system functionality, 3) fully leverage the semantic awareness of the new metadata organization to optimize storage system designs, such as caching, prefetching, and data de-duplication, and 4) implement the new metadata organization, complex query functions, and system design optimizations in large-scale storage systems. This project has broader impact on data-intensive scientific and engineering applications, on graduate and undergraduate education, and on K-12 education through its contributions to storage system research and its integration with an existing NSF-REU site award and an NSF-ITEST award.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 0937993
Program Officer: Joseph Lyles
Project Start:
Project End:
Budget Start: 2009-08-15
Budget End: 2013-07-31
Support Year:
Fiscal Year: 2009
Total Cost: $344,552
Indirect Cost:
Name: University of Nebraska-Lincoln
Department:
Type:
DUNS #:
City: Lincoln
State: NE
Country: United States
Zip Code: 68588