Molecular simulations are becoming important tools for understanding nanoscale processes in science and engineering. Such processes include the motions of proteins and nucleic acids, an understanding of which will enable the design of better drugs; the interactions of liquids and metals in photovoltaic and catalytic applications; and the behavior of complex polymers used in industrial materials. Although national cyberinfrastructure investments are increasing raw computational power, the molecular timescales that scientists can simulate are not increasing proportionately. As a result, most simulations are significantly shorter than the physical processes they are designed to study. Many researchers have developed powerful algorithms that combine multiple simulations to overcome this molecular timescale problem, but these algorithms can still be very difficult to use effectively. This project, called SCALE-MS, will develop computing tools to simplify the process of writing algorithms that use large collections of molecular simulations to reach the long timescales needed for scientific and industrial understanding. These tools will make it much simpler to have simulations interact adaptively, so that simulation results can automatically guide the creation and running of new simulations. By making these complex multi-simulation algorithms easier to create and run, this project will enable users to run existing methods in computational molecular science more easily and make it possible for researchers to create and test new, even more powerful, methods for molecular modeling. This project also brings together researchers from biophysics, chemical engineering, and materials science, combining expertise from multiple simulation fields to develop important new ensemble simulation algorithms.
This adaptive ensemble framework will enable communities of molecular simulation users in chemistry, chemical engineering, materials science, and biophysics to more easily exchange advanced methods and best practices. Many aspects of this framework can also be applied to societal problems that require modeling in other domains, such as climate and earthquake modeling and prediction.

This project addresses a fundamental need across molecular simulation communities from chemistry to biophysics to materials science: the ability to easily simulate long-timescale phenomena and slowly equilibrating ensembles. Researchers are increasingly developing high-level parallel algorithms that use simulation ensembles, that is, loosely coupled molecular simulations that exchange information on a slower timescale than standard parallel computing techniques. However, most existing molecular simulation software cannot express ensemble simulation algorithms in a general manner and execute them at scale. There is thus a need for (i) the ability to express ensemble-based methods in a simple, easy-to-use manner that is agnostic of the underlying simulation code, (ii) support for adaptive and asynchronous execution of ensembles, and (iii) a scalable runtime system that encapsulates the complexity of executing and managing jobs seamlessly on different resources. The project will develop an extensible framework, consisting of a simple high-level adaptive ensemble API and a sophisticated underlying runtime system, to meet these design objectives on NSF's production cyberinfrastructure. A key element of this design is the ability to specify ensemble-based patterns of work- and data-flow independently of the challenges and complexity of managing the ensembles at runtime. This will facilitate the design of new ensemble-based methods by the community and enable scientific end users to encode complex adaptive workflows simply. This approach separates the complexity of compute job management from the expression of sophisticated methods.
The framework will support adaptive and asynchronous execution of ensembles, removing synchronization blocks that have restricted peta- and exa-scaling of simulation methods.
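
As a rough illustration of what "adaptive and asynchronous execution" can mean in practice, the following minimal sketch uses only the Python standard library. It is not the SCALE-MS API: the function names `run_simulation` and `adaptive_ensemble`, the one-dimensional random walk standing in for a molecular simulation, and the "restart from the best end point" rule are all assumptions made for this example.

```python
import random
from concurrent.futures import ThreadPoolExecutor, FIRST_COMPLETED, wait


def run_simulation(start, seed, steps=200):
    """Toy stand-in for one simulation: a 1D random walk returning its end point."""
    rng = random.Random(seed)
    x = start
    for _ in range(steps):
        x += rng.uniform(-1.0, 1.0)
    return x


def adaptive_ensemble(n_workers=4, budget=20):
    """Run `budget` toy simulations, starting each new one from the best result so far."""
    results = []
    best = 0.0
    submitted = 0
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        pending = set()
        # Fill the pool with the initial ensemble members.
        while submitted < min(n_workers, budget):
            pending.add(pool.submit(run_simulation, best, submitted))
            submitted += 1
        while pending:
            # Asynchronous step: react as soon as any member finishes,
            # instead of waiting for the whole ensemble at a global barrier.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                end = fut.result()
                results.append(end)
                best = max(best, end)
            # Adaptive step: completed results decide where new runs start.
            while submitted < budget and len(pending) < n_workers:
                pending.add(pool.submit(run_simulation, best, submitted))
                submitted += 1
    return results, best


if __name__ == "__main__":
    ends, best = adaptive_ensemble(n_workers=4, budget=12)
    print(f"completed {len(ends)} runs, best end point {best:.2f}")
```

The scheduling loop never inspects what `run_simulation` does internally, which mirrors the separation the abstract describes between expressing an ensemble method and managing its execution. A production runtime would additionally have to handle job placement, data movement, and failures across resources.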

This award by the Office of Advanced Cyberinfrastructure is jointly supported by the Division of Chemistry within the NSF Directorate for Mathematical and Physical Sciences and the Division of Chemical, Bioengineering, Environmental, and Transport Systems within the NSF Directorate for Engineering.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1835607
Program Officer: Bogdan Mihaila
Budget Start: 2019-01-01
Budget End: 2021-12-31
Fiscal Year: 2018
Total Cost: $365,621

Name: Pennsylvania State University
City: University Park
State: PA
Country: United States
Zip Code: 16802