Twenty years ago, machine learning (ML) began a trajectory of theoretical and algorithmic improvements that have led to advanced materials for energy efficiency and molecular machines that synthesize molecules in ways unfathomable by the human hand. Those key advances were based upon a foundation of statistical methods that now mirror the field of topological data analysis (TDA), which combines algebraic topology with computational methods to extract new knowledge by characterizing the global shape of data. Professor Clark at Washington State University, Professor Adam at Colorado State University, Professor Pflaum at the University of Colorado Boulder, Professor Sundararaman at Rensselaer Polytechnic Institute, and Professor Zhang at the University of Illinois Urbana-Champaign are developing the Institute for Data-Intensive Research in Science and Engineering - Frameworks project entitled "Descriptors of Energy Landscapes Using Topological Data Analysis" (DELTA). They are advancing TDA for the study of the intensive and complex data sets found in chemistry by developing methods and software tools that characterize the energy landscape, the function that describes energy flow during chemical transformations. Scalable and extensible TDA tools are used to extract new information from the energy landscape and to understand how it changes under different applied conditions, supporting a new paradigm in chemistry that addresses the long-standing challenge of real-time optimization and control of chemical systems. At the intersection of mathematics, data science, and chemistry, students trained under DELTA and its collaborative partners develop the skills and the foundation for a new community of practice.

Chemists generally do not know how the underlying energy landscape (EL) of a transformation changes as a function of system conditions, nor are there quantifiable relationships between intra- and intermolecular interactions and the landscape's topological features. Topological data analysis (TDA) is uniquely poised to extract new information from the EL, as it combines algebraic topology with computational methods to characterize the global shape of data. The Descriptors of Energy Landscapes Using Topological Data Analysis (DELTA) Institute Frameworks project adapts TDA for chemistry applications, invoking persistent homology, Morse theory, catastrophe theory, and other topological descriptors, and creating new software tools that are accessible to domain experts. Tackling the 3N-dimensional energy surface of an N-atom system necessitates scalable and extensible tools that first reduce its dimensionality (Objective 1), then yield geometric and topological descriptors that quantify the way in which the EL is perturbed under different chemical conditions (Objective 2). This provides the basis for new predictive methods that accelerate sampling of large regions of the EL and that learn how to optimize landscape topology to control the fate of reacting molecules and phase behavior (Objective 3).

This project is part of the National Science Foundation's Harnessing the Data Revolution Big Idea activity. The effort is jointly funded by the Division of Chemistry within the NSF Directorate for Mathematical and Physical Sciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Application #: 1934725
Program Officer: Rebecca Peebles
Project Start:
Project End:
Budget Start: 2019-09-01
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2019
Total Cost: $1,600,000
Indirect Cost:

Name: Washington State University
Department:
Type:
DUNS #:
City: Pullman
State: WA
Country: United States
Zip Code: 99164