A Scalable Data Management Abstraction for Large-scale Coupled Simulation Workflows

Coupled scientific simulation workflows, which integrate multiple physics and scales and run at very large scales on high-end resources, have the potential to achieve unprecedented levels of accuracy and provide dramatic insights into complex phenomena. However, the coupled components of these simulation workflows need to interact and exchange significant amounts of data at runtime, and the data often has to be transformed as it flows from source to destination. As the volumes and generation rates of this data grow, the costs (latencies and energy) associated with extracting this data and transporting it for coupling, transformation and analysis have become the dominant overheads and dictate the level of performance and productivity that can be achieved.

The goal of this project is to address these challenges and to develop conceptual solutions as well as a software framework that can enable large-scale, data-intensive simulations. Our approach is based on the premise that, given the large data volumes and associated costs, data will have to be largely processed online, "in-situ" and "in-transit", while it is staged using resources within the computational platform, and that the programming and runtime system must provide abstractions and mechanisms that facilitate such data processing. Our effort is organized around three key research thrusts: (1) programming abstractions for in-situ/in-transit data management; (2) design and implementation of a scalable data staging substrate; and (3) data-centric mapping and scheduling.

Data- and compute-intensive simulations are becoming increasingly critical to a wide range of science and engineering domains, and as a result, this research has the potential to drive research and innovation in these domains. The developed framework and benchmarks also provide computer scientists with a substrate to experiment with and explore data-centric research. The development of human resources, including the training of students, researchers and software professionals, as well as outreach to minorities and underrepresented groups, is integral to all aspects of this effort.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1310283
Program Officer: Robert Chadduck
Project Start:
Project End:
Budget Start: 2013-05-01
Budget End: 2017-04-30
Support Year:
Fiscal Year: 2013
Total Cost: $559,283
Indirect Cost:
Name: Rutgers University
Department:
Type:
DUNS #:
City: Piscataway
State: NJ
Country: United States
Zip Code: 08854