Cecilia Clementi of Rice University is supported by an award from the Chemical Theory, Models and Computational Methods program in the Division of Chemistry to develop multiscale models for macromolecular systems. Professor Clementi and her group are developing machine learning tools to combine the results of microscopic simulation and experimental data into a data-driven modeling framework. The last several years have seen an immense increase in high-throughput and high-resolution technologies for experimental observation. Combined with high-performance techniques for simulating molecular systems at a microscopic level, these advances produce vast and ever-increasing amounts of data. Professor Clementi takes advantage of this abundance of data, using machine learning to extract information and formulate general principles regulating the behavior of molecular systems. Understanding chemical processes at the molecular level is essential for a large number of applications, from energy storage to drug design. Additionally, because the need to represent massive data sets in terms of a model is shared across many fields, Professor Clementi's work may have an impact on a broad range of disciplines, from genomics to finance. This research impacts an interdisciplinary community of students and researchers. Her project includes the development of undergraduate and graduate courses, and outreach activities focused on recruiting and mentoring minority students, especially through collaboration with the Tapia Center at Rice University.

Professor Clementi is developing a data-driven framework to design effective molecular models at multiple resolutions, to address questions currently out of reach of existing computational and experimental approaches. The main idea is to use state-of-the-art machine learning methods to "learn" the coarse-grained dynamical models governing molecular systems (structure, thermodynamics, and kinetics/mechanism) at the mesoscale, by combining data generated from microscopic simulation with experimental data. By integrating different sources of data, this modeling approach reconciles bottom-up and top-down methods. It generates functional building blocks that can be embedded in higher-order simulations to bridge the gap to macroscopic systems. This modeling framework may serve as a keystone for integrating vast amounts of chemical data into quantitative, mechanistic, and comprehensible models. Such models may explain how different molecular components organize and interact as a function of time and space in performing functions at the macroscopic scale. In particular, the developed framework is applied to investigate one specific biomolecular process: the binding of peptides to Major Histocompatibility Complex (MHC) proteins.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Chemistry (CHE)
Type: Standard Grant (Standard)
Application #: 1900374
Program Officer: Michel Dupuis
Project Start:
Project End:
Budget Start: 2019-05-01
Budget End: 2022-04-30
Support Year:
Fiscal Year: 2019
Total Cost: $510,000
Indirect Cost:

Name: Rice University
Department:
Type:
DUNS #:
City: Houston
State: TX
Country: United States
Zip Code: 77005