The Petascale Active Data Store (PADS) will be a petabyte (10^15-byte)-scale online storage server capable of sustained multi-gigabyte/s I/O performance, tightly integrated with a 9 teraflop/s computing resource and multi-gigabit/s local and wide area networks. Its hardware and associated software will enable the reliable storage of, access to, and analysis of massive datasets by both local users and the national scientific community.
The PADS design results from a study of the storage and analysis requirements of participating groups in astrophysics and astronomy, computer science, economics, evolutionary and organismal biology, geosciences, high-energy physics, linguistics, materials science, neuroscience, psychology, and sociology. For these groups, PADS represents a significant opportunity to look at their data in new ways, enabling new scientific insights. The infrastructure will also encourage new collaborations across disciplines. In addition, PADS is a vehicle for computer science research into active data store systems, and will provide rich data on which to investigate new techniques. Results will be made available as open-source software.