Therapeutic cells, most often derived from stem cells, are being developed to treat chronic diseases. Two things must happen before this technology can be made available outside of clinical trials. First, the quality of the therapeutic cells must be improved. Second, their manufacture must be improved so that all cells reach the necessary level of therapeutic activity. Reproducibility in large-scale manufacturing of therapeutic cells is the ultimate objective of this project, which tests the hypothesis that active control will improve the reproducibility of therapeutic cell manufacture. This research will also improve student and public understanding of machine learning (ML) approaches through an interactive art exhibit. This work will enable practical manufacture of regenerative medicines as well as education in ML-based numerical methods and process control strategies for engineers. This project will train graduate students and 10 to 20 undergraduates in the skills required for tool design, cell culture, and reinforcement learning (RL), thereby strengthening our nation’s biomanufacturing capabilities.
This project will develop a modular framework that can be applied to any differentiation pathway and updated as new sensors are developed, new stimulation cues are discovered, and new cell targets are identified. Three research aims will determine the impacts of dynamic control and of the number of RL training iterations on differentiation outcomes: (1) build the control framework, (2) enable training of the control framework, and (3) benchmark dynamic control against static recipes on model cells. Four model cell lines span three germ-layer targets and have established chemical, physical, and electrical differentiation recipes. In the first aim, the RL agent will be built on the TensorFlow package. Markov (memoryless) and non-Markov reward functions will be tested, along with state-of-the-art RL algorithms, on an in silico simulator built from literature-based assumptions (a simplified sketch of such a simulator and training loop appears below). In parallel, the second aim focuses on development of a physical training environment that combines multiple sensing modalities with three control elements (chemical, physical, and electrical). These environments will be parallelizable to allow efficient training of the RL agent. In Aim 3, the RL-trained dynamic control strategy will be benchmarked against static recipes for intra- and inter-batch variability. These experiments will determine whether dynamic control improves manufacturing consistency and will also establish the average time needed to train a new differentiation route.
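The sketch below is a minimal, hypothetical illustration of the kind of in silico simulator and RL training loop described in Aim 1; it is not the project's actual framework. For brevity it substitutes tabular Q-learning for a TensorFlow-based deep-RL agent, and the DifferentiationSim class, the discretized differentiation stages, the per-cue advance probabilities, and the reward values are illustrative assumptions rather than literature-derived parameters.

    # Hypothetical sketch: toy differentiation simulator plus tabular Q-learning.
    # The agent chooses among chemical, physical, and electrical stimulation cues;
    # all dynamics and reward values below are invented for illustration only.
    import numpy as np

    N_STAGES = 5  # assumed discretized differentiation stages (0 = stem, 4 = target)
    ACTIONS = ["chemical", "physical", "electrical"]  # the three control modalities

    class DifferentiationSim:
        """Toy stand-in for the Aim 1 in silico simulator (not literature-based)."""
        def __init__(self, seed=0):
            self.rng = np.random.default_rng(seed)
            # Assumed per-cue probabilities of advancing one differentiation stage.
            self.advance_prob = {"chemical": 0.5, "physical": 0.3, "electrical": 0.4}

        def reset(self):
            self.stage = 0
            return self.stage

        def step(self, action):
            # Stochastically advance (or stall) differentiation under the chosen cue.
            if self.rng.random() < self.advance_prob[ACTIONS[action]]:
                self.stage = min(self.stage + 1, N_STAGES - 1)
            done = self.stage == N_STAGES - 1
            # Markov (memoryless) reward: depends only on the current state.
            reward = 1.0 if done else -0.01
            return self.stage, reward, done

    def train(episodes=2000, alpha=0.1, gamma=0.95, eps=0.1):
        """Epsilon-greedy tabular Q-learning over the toy simulator."""
        env = DifferentiationSim()
        q = np.zeros((N_STAGES, len(ACTIONS)))  # Q-value table: state x action
        for _ in range(episodes):
            s = env.reset()
            done = False
            while not done:
                # Explore with probability eps, otherwise exploit current estimates.
                if env.rng.random() < eps:
                    a = int(env.rng.integers(len(ACTIONS)))
                else:
                    a = int(q[s].argmax())
                s2, r, done = env.step(a)
                # Standard Q-learning update toward the bootstrapped target.
                target = r + (0.0 if done else gamma * q[s2].max())
                q[s, a] += alpha * (target - q[s, a])
                s = s2
        return q

    if __name__ == "__main__":
        q_table = train()
        print("Preferred cue per stage:",
              [ACTIONS[int(a)] for a in q_table.argmax(axis=1)])

In the project itself, the action space would include the dosing and timing of each cue, the state would be built from the multimodal sensor readings of Aim 2 rather than a discretized stage, and the Q-table would be replaced by a TensorFlow policy trained across parallelized environments before benchmarking against static recipes in Aim 3.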
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.