We propose to generalize and certify the performance of reinforcement learning algorithms for control of cyber-physical systems (CPS). Broadly speaking, reinforcement learning applied to physical systems uses predictions made from data to control a system so as to extremize a performance criterion. The project will focus in particular on developing theory and algorithms applicable to hybrid and multi-agent control systems, that is, systems with continuous and discrete elements and systems with multiple decision-making agents, both of which are ubiquitous in CPS across spatiotemporal scales and application domains. Reinforcement learning algorithms are not yet mature enough to guarantee performance when applied to control of CPS. In light of this limitation, the project aims to lay the theoretical and computational foundation for certifying reinforcement learning algorithms so that they may be deployed in society with high confidence.
This project will certify reinforcement learning algorithms that compute optimal control policies in systems with non-classical dynamics and non-classical costs. To achieve this goal, we will generalize convergent algorithms originally designed for purely continuous systems so that they apply to hybrid control systems whose states undergo a mixture of discrete and continuous transitions. Moreover, we specifically aim to ensure this approach is applicable to societal-scale CPS in which multiple agents, some of which may be humans, interact directly with the CPS. These algorithms will be experimentally validated on three testbeds that represent a range of hybrid and multi-agent phenomena that arise in CPS. The first testbed will evaluate the performance of our algorithms on societal-scale traffic flow networks via simulation. The second testbed will consider heterogeneous teams of aerial and terrestrial mobile robots collaborating with human partners to perform construction, inspection, and maintenance tasks on scale facsimiles of infrastructure such as bridges and tunnels. The third testbed will study the closed-loop interaction between individual humans and remote, teleoperated robots that perform dynamic locomotion and manipulation behaviors. This project will also co-organize an interdisciplinary workshop with technology policy experts, the results of which will form the basis for an interdisciplinary multi-campus graduate-level seminar run by the PIs.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.