The Wisconsin Hierarchically-Redundant, Decoupled storage project (HaRD) investigates the next generation of storage software for hybrid Flash/disk storage clusters. The main objective of the project is to improve storage performance in a variety of scenarios, including new application environments such as photo storage at Facebook and Flickr, high-end scientific processing at government labs, and large-scale data processing at Google and Microsoft. To improve the performance of these important applications, the HaRD project focuses on three key issues: client-side Flash-based RAID and file-system integration, server-side memory reduction and multicore scheduling of file-system tasks, and scheduled network transfers. HaRD pulls these technologies together into a synthesized whole through three targeted storage systems: a scalable photo server, a high-performance checkpoint subsystem, and an improved file system for MapReduce workloads. The impact of this project is significant, as HaRD helps to shape the storage software architecture of the next generation of cloud computing services, which are of increasing relevance to both industry and society at large.

Project Report

Digital data is critical to modern society. For home users, such data is found in Word documents, Excel spreadsheets, and photo databases; for large-scale data centers, it includes web search indexes and social graphs of friendships. In both cases, storing such data is essential to society. How such data is stored is changing. For decades, hard-disk drives (HDDs) stored all important information, but the advent of solid-state storage devices (SSDs) has changed how storage systems are built. In this project, the basic tenets of classic HDD-based storage systems are reexamined to determine which aspects need to change to better support the new SSD-based world. Specifically, new interfaces to storage are found to be necessary; one such interface, known as a "nameless write", changes the classic interface to storage and thus enables a much leaner and higher-performance storage stack.

To support investigation of SSDs in real scenarios, two prototyping environments are developed. The first is based entirely in software and enables the exploration of large-scale (future) storage on small-scale (current) systems. The second is based in hardware and makes it possible to explore how these software ideas can be realized in hardware.

Above the storage layer, new ideas in file systems are required as well. One such idea removes the need for write ordering within the device; others have shown that write-ordering guarantees from SSDs (and HDDs) may be suspect. Thus, a new file system that requires no such ordering but still delivers robust crash consistency is developed. In addition, file system checking has long been avoided because of its poor performance. However, on newer, faster systems, a checker may again become commonly used if it is designed for performance. The fast file system checker does exactly this, delivering orders-of-magnitude better performance when checking file system consistency; the ideas behind this faster checker are already deployed and ship on thousands of systems today.

Finally, as SSDs become increasingly popular, so too will multi-SSD systems, and new ideas in reliability for such systems are required. One such idea is Warped Mirroring, which enables a graceful life-to-death cycle at low cost, thus improving over standard RAID approaches. Overall, numerous new techniques and approaches to building SSD-based storage systems have been developed within this project; the result should be faster and more reliable SSD-based storage systems in research and industry.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Communication Foundations (CCF)
Application #: 0937959
Program Officer: Almadena Y. Chtchelkanova
Project Start:
Project End:
Budget Start: 2009-09-15
Budget End: 2013-08-31
Support Year:
Fiscal Year: 2009
Total Cost: $682,000
Indirect Cost:
Name: University of Wisconsin Madison
Department:
Type:
DUNS #:
City: Madison
State: WI
Country: United States
Zip Code: 53715