Many high-end engineering and scientific applications routinely employ advanced cyber infrastructure (CI). CI comprises high performance computing (HPC) systems, software, application developers, and application users. Among these high-end applications, a growing number require repeated runs on HPC systems that are not optimally designed for their execution. This mismatch is further exacerbated by the continuously changing hardware landscape. Faced with these challenges, how is a CI application developer expected to develop and deploy applications in an efficient and sustainable manner? This is the central research question that this project seeks to address. The overarching goal is to develop a systematic and structured way to explore design spaces of CI configurations using machine learning techniques, and to demonstrate value in application and discovery potential through real-world applications. Other project activities integrate and leverage the research outcomes of this project while preparing the next-generation scientific workforce. The project is also leading to the development of curricular modules in parallel algorithms/applications and machine learning, and of conference tutorials for broader outreach. The project will train two PhD students in performing interdisciplinary research.

This project lays the foundations for a novel computational framework, referred to as Sust-CI, that enables developers to design and optimize cyber infrastructures for efficiency. The framework synergistically combines algorithmic abstractions, programming tools, and machine learning techniques to enable adaptive cyber infrastructures. The approach will automatically learn policies for making design decisions that optimize a developer-specified objective (e.g., performance) in a data-driven manner. The project is leading to the development of sample-efficient machine learning algorithms for CI design space exploration and optimization. The key idea is to give advanced CI applications a new capability to derive knowledge by exploring different execution traces (computational behavior) on the given training problem instances. The research will lead to a first-of-its-kind design space exploration framework to enable sustainable use of CI resources for leadership applications in science and engineering.
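
To make the notion of budgeted design space exploration concrete, the following is a minimal illustrative sketch in Python. It is not the Sust-CI framework or any algorithm from this project: the design space (`threads`, `tile_size`, `schedule`), the synthetic `measure_runtime` function that stands in for an actual application run, and the use of simple first-improvement local search are all assumptions chosen only to show the general shape of the problem, in which each evaluation is costly and the search must stay within an evaluation budget.

```python
import random

# Hypothetical design space: all parameter names and values are
# illustrative assumptions, not taken from the project.
DESIGN_SPACE = {
    "threads": [1, 2, 4, 8, 16, 32],
    "tile_size": [16, 32, 64, 128],
    "schedule": ["static", "dynamic"],
}


def measure_runtime(config):
    """Synthetic stand-in for running the application and timing it."""
    threads = config["threads"]
    compute = 100.0 / threads                 # parallel work shrinks with threads
    overhead = 0.5 * threads                  # synchronization grows with threads
    cache_penalty = abs(config["tile_size"] - 64) / 16  # 64 fits the pretend cache
    sched_cost = 1.0 if config["schedule"] == "dynamic" else 2.0
    return compute + overhead + cache_penalty + sched_cost


def neighbors(config):
    """Yield configurations that differ in one parameter by one step."""
    for key, values in DESIGN_SPACE.items():
        i = values.index(config[key])
        for j in (i - 1, i + 1):
            if 0 <= j < len(values):
                yield {**config, key: values[j]}


def explore(budget, seed=0):
    """First-improvement local search using at most `budget` evaluations."""
    rng = random.Random(seed)
    cache = {}  # evaluated configurations, so repeats cost nothing

    def evaluate(config):
        key = tuple(sorted(config.items()))
        if key not in cache:
            cache[key] = measure_runtime(config)
        return cache[key]

    best = {k: rng.choice(v) for k, v in DESIGN_SPACE.items()}
    best_cost = evaluate(best)
    improved = True
    while improved and len(cache) < budget:
        improved = False
        for candidate in neighbors(best):
            if len(cache) >= budget:
                break
            cost = evaluate(candidate)
            if cost < best_cost:
                best, best_cost, improved = candidate, cost, True
                break  # move to the better configuration and restart
    return best, best_cost, len(cache)


if __name__ == "__main__":
    config, cost, evaluations = explore(budget=20)
    print(config, cost, evaluations)
```

In this toy setting the search only evaluates configurations adjacent to its current best, so it inspects a fraction of the 48 possible configurations. A learned policy, as described above, would replace the fixed neighbor rule with decisions informed by previously observed executions.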

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1910213
Program Officer: Seung-Jong Park
Project Start:
Project End:
Budget Start: 2019-05-01
Budget End: 2022-04-30
Support Year:
Fiscal Year: 2019
Total Cost: $499,998
Indirect Cost:
Name: Washington State University
Department:
Type:
DUNS #:
City: Pullman
State: WA
Country: United States
Zip Code: 99164