Benchmarking file and storage systems is challenging. The Wisconsin Next Generation Benchmarks (WiNG) project aims to provide tools and techniques to simplify the benchmarking of file and storage systems, particularly at large scales and storage capacities. The WiNG project solves three current problems in benchmarking file systems.

First, the performance of file systems can be extremely sensitive to the initial allocation of files on disk. Our Impressions framework helps evaluators to easily and quickly create representative, reproducible file system images to initialize the system under test.
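To make the idea of a reproducible initial image concrete, here is a minimal sketch. It is not Impressions itself and does not reflect its actual algorithms, which this summary does not describe. It assumes, for illustration only, that file sizes are drawn from a lognormal distribution; the function name `make_image` and the parameters `mu`, `sigma`, and `max_size` are hypothetical. The only point it demonstrates is that a fixed random seed yields the same set of files on every run.

```python
import os
import random
import tempfile


def make_image(root, n_files, seed, mu=9.0, sigma=2.0, max_size=1 << 20):
    """Create n_files files under root and return their sizes in bytes.

    Sizes come from a lognormal draw (an illustrative assumption) using a
    private, seeded generator, so the same seed reproduces the same sizes.
    """
    rng = random.Random(seed)  # private generator: no global state involved
    sizes = []
    for i in range(n_files):
        # Cap each size so the sketch never writes very large files.
        size = min(int(rng.lognormvariate(mu, sigma)), max_size)
        path = os.path.join(root, f"file_{i:05d}.dat")
        with open(path, "wb") as f:
            f.write(b"\0" * size)
        sizes.append(size)
    return sizes


if __name__ == "__main__":
    # Build the same image twice in separate scratch directories.
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        first = make_image(a, 100, seed=42)
        second = make_image(b, 100, seed=42)
        print("identical size lists:", first == second)
```

A realistic image generator would also have to control directory depth, file counts per directory, and on-disk layout, none of which this sketch attempts.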

Second, evaluators want to understand how file systems perform on large data sets; unfortunately, acquiring the necessary storage capacity can be costly, and running the workloads can be time-consuming. Compressions helps with scale: it enables users to benchmark extremely large data sets on significantly smaller storage systems and, in some scenarios, reduces benchmark running time by orders of magnitude.

Third, evaluators rarely have the expertise to set up and run an interesting range of representative applications. Insight enables users to easily create and run synthetic workloads representative of more complex applications.

In summary, WiNG enables developers to evaluate file systems on applications that real users care about. WiNG is developing benchmarking infrastructure and source code that can and will be used by the file system community. WiNG gives graduate students hands-on training with cutting-edge systems technology. Finally, for outreach to the wider community, WiNG enables an undergraduate to work with elementary-school children in the Scratch programming environment.

Project Report

One of the most important aspects of modern computer systems is the data within them. Data is valuable and must be carefully protected; imagine losing all of one's family photos as an example of how precious data can be. Data must also be managed so that it can be accessed quickly; imagine a computer system that stored family photos successfully but took hours to retrieve any one image. The Wisconsin Next Generation Benchmarks (WiNG) project has focused on building the technology and tools to help improve next-generation storage system designs and implementations. WiNG does so by pursuing four major research goals, none of which had been addressed in previous research.

At the core of WiNG is the desire to build better benchmarks for storage system evaluation. Benchmarks are critical in building systems; good benchmarks put systems into commonly used states and exercise capabilities that are commonly used in practice. Thus, by running such a benchmark, one can learn a great deal about how an existing system behaves. For example, does it store data correctly? Does it access that data quickly enough?

To improve the state of the art in file system benchmarking, WiNG pursues its four goals as follows. First, configuring a storage system before running any particular benchmark is critical; thus, the Impressions tool was developed within WiNG. Impressions enables storage researchers to properly configure the state of a file system before running any tests, which leads to a more realistic assessment of file system capabilities. Second, system evaluators need to be able to run their current software on future hardware platforms. David, a new storage system emulator, enables just that, allowing researchers to experiment with larger and faster storage on today's hardware platforms. Third, new workloads must be traced and analyzed. The iBench workload suite, based on Apple desktop applications such as iTunes, iMovie, and iPhoto, is one such suite. By carefully studying the nature of these modern workloads, researchers can gain new insights into how storage systems should be built. Finally, replaying workloads as complex as those in the iBench suite is challenging. ARTC, an approximate trace replay compiler, makes this task straightforward; by taking in multi-threaded I/O traces and producing an easy-to-run executable, ARTC makes state-of-the-art traces readily accessible to the storage systems research community.

The intellectual merits of the WiNG project center on developing the key technologies needed to build the tools above. For example, Impressions relies on careful use of statistical methods; David uses a new data elision technique that enables a broad class of systems to be emulated; and ARTC uses an array of novel dependency ordering techniques to realistically replay complex workloads.

The broader impacts all relate to the fact that data is increasingly important in modern society. Whether within Google's datacenters or within your home, storing data safely and making it quickly accessible remains a critical challenge in modern computer systems. The tools and techniques developed within WiNG should ultimately lead to a new generation of improved storage systems.

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 1017518
Program Officer: M. Mimi McClure
Budget Start: 2010-09-01
Budget End: 2013-08-31
Fiscal Year: 2010
Total Cost: $499,440

Name: University of Wisconsin Madison
City: Madison
State: WI
Country: United States
Zip Code: 53715