The NSF Center for Computer Assisted Synthesis (C-CAS) is supported by the Centers for Chemical Innovation (CCI) Program of the Division of Chemistry. This Phase I Center is led by Olaf Wiest of the University of Notre Dame. Other team members include Nitesh Chawla, also of the University of Notre Dame, Abigail Doyle of Princeton University, Robert Paton of Colorado State University, Richmond Sarpong of the University of California, Berkeley, and Matthew Sigman of the University of Utah. The goal of C-CAS is to combine data science, machine learning, artificial intelligence, chemical reactions and selectivity optimization, computational chemistry, and organic synthesis to transform how the synthesis of complex organic molecules is planned and executed. This center will transform chemical synthesis and increase the economic and societal competitiveness of pharmaceutical, chemical, and technology industries within the US. The research efforts are enhanced through multiple industrial partnerships. As a result, a new generation of interdisciplinary teams of "data chemists" and machine learning scholars will be trained to address the challenges and demands of modern synthetic chemistry. A program of research, networking, and professional development opportunities will be established to bridge the gap between disciplines, drawing students from a broad range of backgrounds, including students with disabilities. Social and mass media will be utilized to engage the general public.

The vision of C-CAS is to create a comprehensive platform for computationally planning and optimizing synthetic pathways, predicting reaction performance, and addressing the selectivity (e.g., chemo-, regio-, stereo-) challenges inherent in complex molecule synthesis. This center has three thrusts. The first thrust is to use representation learning to unify heterogeneous data from a variety of sources, including unbiased, dense, and "clean" microscopic and macroscopic data from the literature, patents, high-throughput experimentation, electronic laboratory notebooks, and high-throughput computation. The second thrust is to exploit this unified data representation to address the "over-the-arrow" problem of chemical reaction optimization that is the rate-limiting step in most syntheses, through active and transfer learning that will yield interpretable and explainable statistical and machine learning models. The third thrust is to demonstrate the combined use of heterogeneous data and optimization algorithms with existing synthesis planning programs to score possible synthetic routes and provide synthetic chemists with go/no-go decisions in the synthesis of a complex molecule. C-CAS provides quantitative training to a new generation of "data chemists" and to machine learning researchers, thereby equipping them for participation in interdisciplinary science, including industry-academic collaborations. The Center delivers programming such as online and in-person workshops, networking, and formulation and sharing of best practices in this rapidly evolving field that enhances the interdisciplinary experience and societal impact. The data science and computational portion in C-CAS provides non-traditional research and networking opportunities that are particularly suitable to students with disabilities and underrepresented minority students. C-CAS engages non-scientists in discussions of the integration of chemistry and machine learning through mass and social media as well as in-person communications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Chemistry (CHE)
Type: Standard Grant (Standard)
Application #: 1925607
Program Officer: Michelle Bushey
Project Start:
Project End:
Budget Start: 2019-09-01
Budget End: 2022-08-31
Support Year:
Fiscal Year: 2019
Total Cost: $1,800,000
Indirect Cost:

Name: University of Notre Dame
Department:
Type:
DUNS #:
City: Notre Dame
State: IN
Country: United States
Zip Code: 46556