OCI 0852543
Garth Gibson, Principal Investigator
Carnegie Mellon University
SGER: Investigating Degrading Failure Recovery in Large Scale, Heavily Utilized Disk Storage Systems
This SGER proposal responds to NARA's interest in file systems and I/O research, specifically in methods and factors that contribute to fault-tolerant, highly scalable, sustained I/O for very large disk-based systems. At a workshop, the P.I. predicted the dangers of increasing disk sizes in ever-larger disk storage systems.
To better understand the implications of this speculation, CMU will examine how two trends affect performance and reliability in large-scale disk storage systems: the increasing size of magnetic disks, which lengthens the periods of degraded performance during failure recovery, and the scaling of stored data capacity, which increases the total number of components that might fail. The P.I. and his team will explore different strategies for trading off capacity overhead, performance overhead, and data reliability: for example, slower recovery with more redundant data, faster recovery with more performance degradation, and novel uses of redundancy as point solutions for specific failure modes.
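
The two trends and the recovery-speed trade-off described above can be illustrated with a back-of-the-envelope calculation. The sketch below is not the project's model. It assumes a rebuild that streams a whole failed disk at a fixed fraction of its bandwidth, independent failures at a constant annual rate, and rebuilds that never overlap. The function names, parameters, and sample figures are all hypothetical and chosen only for illustration.

```python
HOURS_PER_YEAR = 8766  # average year length, including leap years


def rebuild_hours(capacity_tb, disk_mb_per_s, rebuild_share):
    """Hours to reconstruct one failed disk.

    rebuild_share is the assumed fraction of disk bandwidth given to
    recovery: a larger share means faster recovery but more degradation
    of foreground I/O while the rebuild runs.
    """
    if not 0 < rebuild_share <= 1:
        raise ValueError("rebuild_share must be in (0, 1]")
    capacity_mb = capacity_tb * 1_000_000  # decimal terabytes to megabytes
    return capacity_mb / (disk_mb_per_s * rebuild_share) / 3600


def expected_failures_per_year(num_disks, annual_failure_rate):
    """Expected disk failures per year, assuming independent failures."""
    return num_disks * annual_failure_rate


def degraded_fraction(num_disks, annual_failure_rate,
                      capacity_tb, disk_mb_per_s, rebuild_share):
    """Rough fraction of the year spent rebuilding somewhere in the system.

    Assumes rebuilds do not overlap, so this is only an approximation
    and is capped at 1.0.
    """
    failures = expected_failures_per_year(num_disks, annual_failure_rate)
    hours = failures * rebuild_hours(capacity_tb, disk_mb_per_s, rebuild_share)
    return min(1.0, hours / HOURS_PER_YEAR)


if __name__ == "__main__":
    # Hypothetical figures: 100 MB/s disks, 3% annual failure rate.
    for disks, tb in [(1_000, 1.0), (10_000, 2.0)]:
        for share in (0.25, 1.0):
            frac = degraded_fraction(disks, 0.03, tb, 100.0, share)
            print(f"{disks} disks x {tb} TB, rebuild share {share}: "
                  f"{frac:.1%} of the year in recovery")
```

Under these assumptions, doubling disk capacity doubles each rebuild window, and adding disks raises the number of rebuilds per year, so the two effects multiply. Raising the rebuild share shortens each window at the cost of heavier degradation while it lasts.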