The Wisconsin SAFE project is developing the core technology to both analyze and construct more reliable storage systems. Analyzing the fault-handling machinery of storage systems involves two conflicting goals: thoroughness and efficiency. The SAFE project addresses both by injecting faults using "gray box" knowledge of the systems under test, so that their fault-handling machinery is exercised selectively rather than exhaustively. Applying this technique allows the SAFE project to assess the robustness of current storage technology, including commercial file systems and distributed storage systems. When constructing more reliable storage systems, a further goal is to ensure that the improvements can be deployed in practice. Thus, one approach taken in this project is to transparently insert a failure-handling layer between the file system and the disk it manages; in this way, fault-resilient storage systems can be deployed without cooperation from either developers or disk manufacturers. Together, these techniques for failure analysis and failure handling aim to enable a new generation of more robust and reliable storage systems.
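As a rough illustration of the two ideas above, the following sketch shows a fault injector that uses knowledge of block types to fail only selected reads, and a failure-handling layer that sits between the caller and the disk. It is a toy model under stated assumptions, not the project's implementation: the class names (MemoryDisk, FaultInjector, MirrorLayer), the block-type map, and the choice of mirroring as the recovery strategy are all hypothetical and chosen only for illustration.

```python
class DiskError(IOError):
    """Raised when a (simulated) block read fails."""


class MemoryDisk:
    """Hypothetical in-memory disk: a fixed array of equal-sized blocks."""

    def __init__(self, num_blocks, block_size=16):
        self.blocks = [bytes(block_size) for _ in range(num_blocks)]

    def read(self, n):
        return self.blocks[n]

    def write(self, n, data):
        self.blocks[n] = data


class FaultInjector:
    """Gray-box injector (assumed design): fails reads only for blocks
    whose type matches the target, instead of failing blocks at random."""

    def __init__(self, disk, block_types, target_type):
        self.disk = disk
        self.block_types = block_types  # block number -> type label
        self.target_type = target_type
        self.injected = 0               # count of faults injected so far

    def read(self, n):
        if self.block_types.get(n) == self.target_type:
            self.injected += 1
            raise DiskError(f"injected read fault on block {n}")
        return self.disk.read(n)

    def write(self, n, data):
        self.disk.write(n, data)


class MirrorLayer:
    """Failure-handling layer (assumed strategy: mirroring). Each logical
    block n is also stored at n + num_logical on the device below, and a
    failed read of the primary copy falls back to the mirror."""

    def __init__(self, lower, num_logical):
        self.lower = lower
        self.num_logical = num_logical

    def write(self, n, data):
        self.lower.write(n, data)
        self.lower.write(n + self.num_logical, data)

    def read(self, n):
        try:
            return self.lower.read(n)
        except DiskError:
            # Primary copy failed; recover from the mirror copy.
            return self.lower.read(n + self.num_logical)


# Usage: three logical blocks mirrored onto a six-block disk, with
# faults injected only on reads of the block labeled "inode".
disk = MemoryDisk(6)
types = {0: "superblock", 1: "inode", 2: "data"}
injector = FaultInjector(disk, types, target_type="inode")
storage = MirrorLayer(injector, num_logical=3)
storage.write(1, b"inode contents")
recovered = storage.read(1)  # primary read fails, mirror is used
```

In this sketch the caller of MirrorLayer sees the same read and write interface as the raw disk, which is the sense in which such a layer could be inserted transparently; a real layer would also have to handle write failures, corruption, and space management.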
Broader significance: Storage systems are becoming increasingly central to everyone's "digital" lives. Hence, techniques that improve the robustness of file storage, ensuring that valuable data is not lost or altered, are of paramount importance. Virtually all domains of computing, from home users on standard PCs to high-end web services such as Google, use standard file systems as their main repository of data; improving the robustness of standard file systems improves all of these systems as well.