Application workflows typically involve large-scale simulations, applications, and subsequent analysis, verification, and validation. One of the most important requirements shared by applications running on petascale systems is fast, portable, scalable I/O that is componentized, metadata-rich, and easy to use. The Adaptable I/O System (ADIOS) serves as a high-level I/O interface that lets an application select which I/O libraries and formats to use without changing the application program. Codes such as GTC generate hundreds of TBs of data on hundreds of thousands of cores in a twenty-four-hour period. One must therefore optimize the I/O both for fast output in the generation phase and for fast input in the analysis phase, since both writing and reading efficiency are critical for knowledge discovery. A high-level software infrastructure that allows I/O to be optimized for entire workflows (including high-performance I/O when reading data with different access patterns) would greatly improve end-to-end performance in the knowledge discovery cycle. This project plans to develop efficient I/O methods that enable application scientists to optimize data for writing and that can re-organize the data to obtain optimal performance for the reading patterns scientists commonly use. This project directly impacts the I/O performance of many petascale applications, including the GTC, GTS, XGC-1, Chimera, and S3D codes, and the project team will work directly with these application teams to optimize the I/O in all stages of their scientific workflows.

Project Report

The main goal of this project was to research and develop an application-driven approach to enable large-scale input/output (I/O) that is fast, portable, scalable, metadata-rich, and easy to use for scientific workflows. We focused on applications that read and write petabytes of data in a year, both during the checkpoint/restart phase of large-scale simulations and for data used in post-processing. My students and postdocs were key members of the team that received the 2013 R&D 100 award for the Adaptable I/O System (ADIOS). We produced five papers from this project, one of which was nominated for best student paper at SC 2012 and one of which won my student first place in the ACM Graduate Student Research Competition in 2012. The key to our work was speeding up the I/O of many simulations and experiments, as shown in (http://science.energy.gov/~media/ascr/images/ADIOS.jpg). For almost every simulation we worked with, we were able to speed up the I/O by over 10X compared with other state-of-the-art techniques, including MPI-IO, pnetcdf, and parallel HDF5. In this project, we worked with many codes, including climate and weather codes used by NASA and Chinese researchers (GEOS5, GRAPES), and we showed that by laying out the data on parallel file systems along a Hilbert curve over the space-time elements, we could dramatically reduce the time needed to read the data during analysis and visualization. Furthermore, we used machine learning techniques (Deterministic Annealing) to learn the I/O access patterns when users were visualizing their data, and then pre-fetched their data to eliminate much of the reading overhead during interactive visualization.
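The Hilbert-curve layout mentioned in the report can be illustrated with a minimal sketch. It is not the project's implementation: it assumes a two-dimensional grid of data chunks whose side length is a power of two, whereas the report describes a layout over space-time elements. The function names `hilbert_index` and `layout_order` are hypothetical and chosen only for this illustration. The sketch maps each chunk coordinate to its position along the curve; writing chunks in that order keeps chunks that are neighbors on the grid close together in the file, which is the property that benefits reads of arbitrary subregions.

```python
def hilbert_index(n, x, y):
    """Return the position of cell (x, y) along a Hilbert curve.

    Illustrative assumption: the grid is n x n with n a power of two.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        # Each quadrant at this scale contributes s*s cells to the index.
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so the sub-curve has the right orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


def layout_order(n):
    """List all chunk coordinates of an n x n grid in Hilbert-curve order."""
    cells = [(x, y) for x in range(n) for y in range(n)]
    return sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))


if __name__ == "__main__":
    # Print the order in which a 4 x 4 grid of chunks would be written.
    for position, cell in enumerate(layout_order(4)):
        print(position, cell)
```

In this ordering, any two chunks that are consecutive in the file are also adjacent on the grid, so a read of a compact subregion touches a small number of contiguous file ranges instead of many scattered ones, as can happen with a row-major layout.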

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Type
Standard Grant (Standard)
Application #
1003228
Program Officer
Almadena Chtchelkanova
Project Start
Project End
Budget Start
2010-06-01
Budget End
2014-05-31
Support Year
Fiscal Year
2010
Total Cost
$299,863
Indirect Cost
Name
University of Tennessee Knoxville
Department
Type
DUNS #
City
Knoxville
State
TN
Country
United States
Zip Code
37916