The objective of this research is to develop a new framework for robust adaptive/approximate dynamic programming to address grand challenges arising in engineering and biology, such as the smart grid, brain research, robotics, and flight control. The approach is to take explicit advantage of versatile techniques from two active areas of research: reinforcement learning and neural networks, and modern nonlinear control.
Intellectual Merit This interdisciplinary research initiative, driven by the need to build brain-like reinforcement learning systems and, ultimately, to understand brain function, is significant in several respects. It will substantially advance the state of the art in approximate dynamic programming and address the truly model-free setting. In addition, instead of building exact mathematical models, which is often very hard, if not impossible, for contemporary complex problems arising in engineering and biology, this proposal adopts a novel interconnected-system viewpoint on the basis of the PI's work on nonlinear small-gain theory.
Broader Impacts The proposed work will lead to the development of new tools for robust adaptive critic designs in interconnected complex systems. Not only are these tools expected to find use in emerging engineering applications such as the smart grid, robotics, and flight control, but they will also provide deeper insight toward the long-term goals of understanding brain function and building brain-like reinforcement learning engineering systems. The proposed research will have a substantial direct impact on education at the PI's institution by engaging students from several areas and departments.
The field of adaptive/approximate dynamic programming (ADP), with diverse applications in engineering and biology, has undergone rapid progress over the past few years. Over the last project year, the PI and his students have continued to develop a new framework, termed robust adaptive dynamic programming (RADP), for the design of robust optimal controllers for continuous-time linear and nonlinear systems subject to both parametric and dynamic uncertainties. Applications to power systems and biological motor control have been considered from this new RADP perspective, and significant new results have been obtained on these two exciting topics.

In the past ADP literature, very little research has been devoted to dynamic uncertainty in physical and biological systems. Dynamic uncertainty arises in different contexts, such as model reduction, unmodeled dynamics, and incomplete state information. This project has made significant contributions to the development of a robust variant of existing ADP theory, focusing mainly on continuous-time, continuous state-space models. A recent series of papers by the PI shows that robust optimal control policies can be obtained via a recursive numerical algorithm using online information, without solving the Hamilton-Jacobi-Bellman (HJB) equation (for nonlinear systems) or the algebraic Riccati equation (ARE) (for linear systems), even when the system dynamics are not precisely known. Robustness to dynamic uncertainty is guaranteed and analyzed rigorously using Lyapunov and small-gain methods.

More recently, the PI and his students have filled a gap in the ADP literature by studying continuous-time nonlinear systems subject to both parametric and dynamic uncertainties. Practical learning algorithms have been developed and applied to controller design problems for a jet engine and a one-machine power system. This can be seen as a step toward bringing together two separate fields: ADP and applied nonlinear control. While previous work on ADP focuses on affine systems, the PI has obtained a new result on ADP for nonaffine systems, in which the control inputs appear nonlinearly. More interestingly, we have initiated a study of global ADP for nonlinear systems that replaces neural network approximations with techniques from semidefinite programming; this line of research aims to achieve significant computational improvements over previous ADP algorithms based on neural network approximations. Last but not least, we have shown with simulation and experimental data that RADP is an efficient computational mechanism for describing human movement.
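As a concrete illustration of the iterative, ARE-free principle mentioned above, the following Python sketch implements a Kleinman-style policy iteration for a hypothetical linear-quadratic problem: each step solves only a Lyapunov equation, and the iterates converge to the stabilizing ARE solution. This is a minimal, model-based sketch for illustration only; it is not the PI's data-driven RADP algorithm, in which the system matrices A and B (assumed known here) are replaced by quantities identified from online state and input measurements.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

# Hypothetical second-order plant used only for illustration; in the
# data-driven RADP setting A and B would not be available to the learner.
A = np.array([[0.0, 1.0],
              [-1.0, -2.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)          # state weighting
R = np.array([[1.0]])  # control weighting

# Policy iteration (Kleinman-type): start from a stabilizing gain and
# alternate policy evaluation (a Lyapunov equation) with policy improvement.
K = np.zeros((1, 2))   # A is Hurwitz here, so K0 = 0 is stabilizing
for _ in range(20):
    Ak = A - B @ K
    # Policy evaluation: solve  Ak' P + P Ak + Q + K' R K = 0  for P
    P = solve_continuous_lyapunov(Ak.T, -(Q + K.T @ R @ K))
    # Policy improvement: K_new = R^{-1} B' P
    K_new = np.linalg.solve(R, B.T @ P)
    if np.linalg.norm(K_new - K) < 1e-10:
        break
    K = K_new

# The iterates converge to the stabilizing solution of the ARE.
print("Policy-iteration P:\n", P)
print("Direct ARE solution:\n", solve_continuous_are(A, B, Q, R))
```

The design choice this sketch highlights is that each iteration involves only a linear (Lyapunov) equation rather than the quadratic ARE; the data-driven algorithms referenced above carry out the same evaluation/improvement steps using online information collected along system trajectories.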