Process Systems Engineering (PSE) has historically been advanced by using physics-based mathematical models and computer algorithms to design, optimize, and control complex systems. Recent advances in the broad field of Data Science and Artificial Intelligence have led to a series of breakthroughs in the development of Machine Learning (ML) tools that can be used to derive mathematical models from large data sets. However, ML-based models can be limited in their ability to give insight into the physical origins of the system's behavior and can result in poor predictions outside of the range of the original data set used to create the model. This CAREER project aims to develop hybrid modeling approaches that preserve what we already know about system behavior to make data-driven ML models more reliable, leading to more accurate medical diagnoses, smarter autonomous vehicles, and safer chemical plants. Proposed curriculum development activities aim to introduce data science concepts into Chemical Engineering courses and high school statistics classes. Proposed outreach activities aim to increase the number of female engineers in the field of PSE.

The proposed methodology aims to develop algorithms that enable the simultaneous training of modern ML models (i.e., Neural Networks and Gaussian Process Models) with physical constraints derived from the discretization of first-principles-based models. The proposed research will involve a systematic study of pre-processing and integration of data, identification of low-dimensional descriptive feature spaces, hybridization of ML models with first-principles-based models, and quantification of the uncertainty of hybrid model predictions. The specific research aims are: (1) theoretical advancement of mathematical techniques for training nonparametric ML models to satisfy first-principles-based model predictions (hybrid modeling); (2) quantification of the uncertainty of hybrid models in the presence of noisy and incomplete data sets; (3) algorithmic development for mixed-integer nonlinear optimization problems for design and synthesis with embedded hybrid models. A series of case studies that include production of pharmaceuticals, polymers, and chemicals will be used to develop a benchmarking library for testing various hybrid modeling architectures. The design and optimization of bioprocesses will be studied using hybrid modeling to connect gene-level control to macroscale process optimization. A set of hybrid modeling and optimization teaching modules, suitable for incorporation within existing core courses of the Chemical Engineering curriculum, will be developed and broadly disseminated. Proposed outreach activities aim to introduce data-driven decision-making to high-school students through statistics classes and to increase gender diversity in the field of PSE.
A high-school teacher will be hosted by the Principal Investigator in collaboration with the Center for Education Integrating Science, Mathematics, and Computing (CEISMC) at Georgia Tech and the Georgia Intern Fellowships for Teachers (GIFT) program, which provides paid summer STEM internships in industry workplaces and university laboratories for K-12 science, mathematics, and technology teachers.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Georgia Tech Research Corporation
United States