Commodity systems are changing the face of storage and I/O. Commodity disks are cheap but unreliable. Commodity file systems are widely used but bug-prone. Despite these critical problems, people increasingly rely on commodity computing systems to store data of critical importance, whether from an emotional (e.g., family photos), business (e.g., a customer database), or legal (e.g., your tax return) standpoint.
The Wisconsin Arrays in Software Project (WASP) tackles the problem of building the next generation of robust and reliable commodity RAID systems. Such a RAID system must be reliable, correctly detecting and recovering from a broad range of disk faults. It must also integrate properly into existing systems, delivering high performance and proper recovery in the case of system crashes. Finally, it must be flexible, enabling new protection strategies to be developed and deployed to cope with new problems as they arise. WASP has two major components. The first is a RAID design checker known as Sting, which ensures that protection schemes correctly protect data from a given set of disk faults. The second is a framework that allows the specification and automatic generation of sophisticated RAID systems; known as WASP Nest, this part of the project promises a new, higher-level design methodology for commodity RAIDs.
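As a hedged illustration of the kind of fault interaction a RAID design checker has to consider, the sketch below models a single-parity stripe in which a silent corruption of one data block is folded into parity by a naive scrub. It is not Sting's implementation; the function names, the block contents, and the "recompute parity on mismatch" scrub policy are all assumptions made for the example.

```python
from functools import reduce
from typing import List, Tuple


def xor_blocks(blocks: List[bytes]) -> bytes:
    """XOR equal-length blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def reconstruct(stripe: List[bytes], parity: bytes, lost: int) -> bytes:
    """Rebuild the block at index `lost` from the surviving blocks and parity."""
    survivors = [block for i, block in enumerate(stripe) if i != lost]
    return xor_blocks(survivors + [parity])


def simulate_naive_scrub() -> Tuple[bytes, bytes]:
    """Return what block 1 reconstructs to before and after a naive scrub.

    Hypothetical scenario: three data blocks plus one XOR parity block.
    """
    original = [b"AAAA", b"BBBB", b"CCCC"]
    parity = xor_blocks(original)

    # A silent corruption flips one byte of block 1 on disk.
    on_disk = list(original)
    on_disk[1] = b"BXBB"

    # Before any scrub, parity still encodes the good data, so rebuilding
    # block 1 from the other blocks and parity yields the original contents.
    before_scrub = reconstruct(on_disk, parity, lost=1)

    # Assumed naive scrub policy: on a parity mismatch, trust the data
    # blocks and recompute parity from them. The corruption now lives in
    # the parity block as well.
    if xor_blocks(on_disk) != parity:
        parity = xor_blocks(on_disk)

    # After the scrub, rebuilding block 1 reproduces the corrupted contents;
    # the good copy can no longer be recovered from this stripe.
    after_scrub = reconstruct(on_disk, parity, lost=1)
    return before_scrub, after_scrub


if __name__ == "__main__":
    before, after = simulate_naive_scrub()
    print("rebuilt before scrub:", before)
    print("rebuilt after scrub: ", after)
```

In this toy model the stripe is recoverable until the scrub runs, which is why a checker needs to reason about sequences of faults and repair actions rather than about single faults in isolation.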
This project centered on next-generation technologies for large-scale RAID storage systems. Three major research contributions have arisen from the work. The first is a new methodology for evaluating protection schemes in RAID storage systems; applied to real designs of current commercial products, the method found numerous design flaws in existing systems, and it has had commercial impact, as some of those systems have been modified to address the flaws. One of the major problems discovered is now known as the "parity pollution" problem, in which a single corruption within storage spreads to the parity block and thus leads to data loss or corruption within the RAID. The work shows that this problem exists in many existing systems, and suggests a novel fix that is now used in production. The second is highly accurate, scalable emulation technology for future storage systems. Researchers currently tend to explore how their ideas work on current technology (the disks or flash-based storage systems available for purchase today), leaving unanswered the question of how a new idea will work on future devices. To address this dilemma, the project developed David, a novel emulator for future storage systems. David makes such emulation practical through a technique in which the emulator does not actually store any data; because most benchmarks use dummy data to test the performance of the underlying storage system, this approach works for most use cases. David thus enables systems research to explore how a given idea will work on future technology. The third research contribution is new technology for user-level file systems. User-level file systems are enabling a broad new class of file systems to be built; however, when such systems crash, the result is an unstable and error-prone system.
Re-FUSE, a new framework developed within this project, addresses this directly by providing a restartable framework for user-level file systems. Re-FUSE achieves this by carefully recording and monitoring the activity of these file systems and transparently restarting them upon a crash; applications running above do not notice the crash and thus continue to operate normally. Re-FUSE thus enables user-level file systems to be more widely deployed, as their users no longer need to worry that the file system might crash and corrupt their data.
Beyond the research contributions and their intellectual merit, this work has had broader impacts on society through both human resource development and education. In terms of human resources, the project has led to the graduation of a Ph.D. student, numerous Master's students, and one undergraduate. The Ph.D. and Master's students are now working in industry in Silicon Valley or the Pacific Northwest; the undergraduate is now in graduate school pursuing a Ph.D. at a top-ranked institution. In this manner, the project has contributed to national industries and to the next generation of academics. In terms of broader educational impact, a new service learning course has been created and is now training students at the University to go into the community's local elementary and middle schools to teach young people the basic concepts of Computational Thinking. The university students gain tremendous experience in teaching; the younger children gain insight into the beauty and wonder of Computer Science, perhaps shaping or inspiring their future career paths.