Emerging general-purpose high-performance computing and storage resources provide great opportunities to tackle grand-challenge problems in science and engineering. However, most systems lack application awareness and therefore do not adequately support the special requirements of many ultra-scale dynamic scientific applications. These computation- and data-intensive applications aim to model and investigate highly dynamic and sometimes drastically changing phenomena in science and engineering, such as interacting black holes, global and regional high-resolution weather forecasting, combustion and detonation simulation, and many others. Furthermore, it is challenging and time-consuming for scientists and engineers to develop large-scale parallel and distributed scientific applications from scratch.
This project fills this gap and aims to (1) create an integrated framework of scalable adaptive runtime management algorithms, libraries, and a toolkit (called SMART) with user-friendly programming models, so that scientists can write sequential programs and still obtain automatic parallelism, high performance, and high throughput; (2) design a suite of application-aware adaptive algorithms that holistically address computation, communication, data, and energy management in systems with thousands of processors (such as clusters, grids, and clouds); and (3) enable high-impact, real-world, large-scale scientific applications with additional tools for simulation and visualization.