The project aims to develop tools and methods, inspired by neurobiology, for building brain-like reinforcement learning systems and, ultimately, for contributing to the understanding of brain function. It seeks to address fundamentally challenging issues in the robust optimal management of large complex systems subject to stochastic effects, nonlinearity, and dynamic uncertainties. Research findings from this project will contribute new solutions to emerging engineering applications such as the smart electricity grid, robotics, and intelligent transportation systems. The proposed research will also have a substantial direct impact on education at the PI's institution, and its interdisciplinary nature should appeal to students from several departments.
The project team will work on stochastic variants of adaptive dynamic programming (ADP) for continuous-time systems subject to stochastic and dynamic disturbances. ADP is a practically sound, data-driven, non-model-based approach to optimal control design in complex systems. It has been studied extensively for Markov decision processes, mostly with discrete, finite state spaces, and for deterministic dynamic systems in both discrete and continuous time. Stability and robustness in the presence of dynamic uncertainties, however, are seldom addressed systematically. For complex modern engineering systems or biological systems, in which stability is an important concern, straightforward application of existing ADP results is unlikely to be productive. Hence, it is necessary to develop novel tools and methods for ADP design of general stochastic systems in continuous time and continuous state space, with rigorous stability and convergence analysis. The novelty of the proposed research lies in applying and extending techniques from reinforcement learning, stochastic systems theory, and nonlinear control theory. The specific goals of the proposal are to develop tools and methods for stochastic adaptive dynamic programming for linear and nonlinear stochastic systems, to design stochastic adaptive optimal controllers that are robust to dynamic uncertainties, and to apply the results to human motor systems. Rigorous stability proofs, convergence analysis of learning algorithms, and robustness analysis will be pursued. Important classes of continuous-time linear and nonlinear models with multiplicative and additive noise will be studied, along with non-model-based stochastic optimal controller designs.
Beyond engineering applications, bringing together ADP and research in computational neuroscience may yield new methodologies for the diagnosis and treatment of neurodegenerative genetic disorders that affect muscle coordination. One such medical condition is Parkinson's disease, which affects approximately seven million people globally, including one million in the United States. Generalizing the PI's recent work on linear stochastic variants of robust adaptive dynamic programming could lead to a new computational mechanism for human motor control.