Catalysts speed up chemical reactions, and their development impacts areas such as energy, the environment, biotechnology, and drug design. The vision of this project is to harness computational tools from modern statistics and machine learning to perform data-driven discovery of new catalysts. To this end, a collaborative team has been assembled with complementary expertise in catalysts, materials science, biophysics, computational modelling, statistics, signal processing, and data science. How a reaction is accelerated depends on the dynamic changes in the structure and shape of a catalyst and its associated chemical reactants (together, a catalytic system). The goal of this project is to explore, describe, and quantify the dynamic structures of enzyme and nanoparticle catalysts at the atomic level. Recent advances in microscopy and spectroscopy now make it possible to measure dynamic changes in time and space in great detail. This project combines recent advances in data science with these new experimental tools to extract features that describe the dynamic behaviour of catalytic systems. In addition, the project will enhance the development of educational infrastructure for data-intensive and interdisciplinary science, contribute to workforce development, promote gender equality in the sciences, and disseminate scientific knowledge.

The guiding hypothesis of this research is that catalytic functionality cannot be fully understood without describing the atomic-level structural changes triggered by the molecular interactions of reactants with the catalyst. This hypothesis is tested using experimental datasets obtained from electron microscopy and single-molecule fluorescence resonance energy-transfer spectroscopy to explore structural dynamics in nanoparticles and enzymes. A data-analysis workflow that integrates denoising, dimensionality reduction, clustering, and dynamic Markovian modelling enables the description and classification of the complex dynamical evolution seen in spatiotemporally resolved measurements. The research develops and applies advanced methodologies for processing noisy, high-dimensional data, a crucial bottleneck in the analysis of dynamic systems. The information extracted from experimental data guides the computational sampling of the conformational space of proteins and nanoparticles within a statistical physics framework, using supercomputer technology. This information also facilitates the development of physical models that probe phenomena that are currently experimentally inaccessible, such as picosecond nuclear motions, as well as protein conformational changes and their coupling with chemical events. The transformative impact is a better understanding of catalysis, achieved by establishing a link between dynamic system response and catalytic functionality. The computational approaches developed through this project have the potential to be applied generally to many fundamental problems in materials science and structural biology where dynamic behaviours are important.

This project is part of the National Science Foundation's Harnessing the Data Revolution (HDR) Big Idea activity, and is jointly supported by the HDR and the Division of Chemistry within the NSF Directorate of Mathematical and Physical Sciences.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Application #: 1940188
Program Officer: Pui Ho
Project Start:
Project End:
Budget Start: 2019-10-01
Budget End: 2021-09-30
Support Year:
Fiscal Year: 2019
Total Cost: $324,793
Indirect Cost:
Name: University of Arkansas at Fayetteville
Department:
Type:
DUNS #:
City: Fayetteville
State: AR
Country: United States
Zip Code: 72702