Project Proposed: This project develops a storage instrument with 2.4 petabytes (PB) of raw capacity to support a variety of research projects in experimental HPC and cloud storage. It aims both to increase local resources for scientific computing and to act as a testbed for GPU-enabled reliable storage. The instrument enables increased virtualization of storage, concurrent access to storage under fault scenarios (e.g., RAID), and a series of data-intensive applications. Lessons learned from the existing system will be leveraged, and the existing and new systems will be integrated in a way that supports cloud and disaster-recovery modes of operation.

The project enables the following studies and research projects:

- Studying effective error rates and reliability at highly refined levels, and seeking means to identify and manage additional classes of errors (e.g., misdirected writes);
- Creating semi-analytical models that allow tunable storage characteristics within a lifetime-reliability-performance cost space;
- Running applications from data mining (including bioinformatics as drivers for proving the efficacy of the final system) to achieve new science in these data-intensive domains; and
- Conducting computer science research aimed at simplifying the use of active storage computation.

Broader Impacts: This instrumentation increases the institution's capacity to conduct cutting-edge research by providing inexpensive, fast, practical, and reliable petascale storage for data-intensive applications. Significant computational power logically close to that storage enables new science. Student training (including members of underrepresented groups) will be emphasized. Knowledge dissemination through this effort could be significant.
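To illustrate the kind of semi-analytical reliability modeling in the lifetime-reliability-performance space that the project targets, the sketch below computes the classical mean-time-to-data-loss (MTTDL) estimate for a RAID-5 group. This is a standard textbook approximation assumed here for illustration only; it is not the proposal's actual model, and the function name and parameters are hypothetical.

```python
# Illustrative sketch (not the project's model): classical MTTDL estimate
# for a single RAID-5 group of n disks, assuming independent, exponentially
# distributed failures. Data is lost if a second disk fails during the
# rebuild window of the first:
#     MTTDL ~= MTTF^2 / (n * (n - 1) * MTTR)
def raid5_mttdl_hours(n_disks: int, mttf_hours: float, mttr_hours: float) -> float:
    """Approximate MTTDL (in hours) for one RAID-5 group."""
    if n_disks < 2:
        raise ValueError("RAID-5 needs at least 2 disks")
    return mttf_hours ** 2 / (n_disks * (n_disks - 1) * mttr_hours)

# Hypothetical example: 8 disks, 1,000,000-hour MTTF, 24-hour rebuild.
mttdl = raid5_mttdl_hours(8, 1_000_000, 24)
print(f"MTTDL ~= {mttdl / 8760:.0f} years")
```

Models of this shape expose the tunable trade-off the abstract describes: shrinking the rebuild time (MTTR) or the group size directly raises the estimated lifetime, at a cost in performance or capacity overhead.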