Reinforcement learning (RL) is one of the fastest-growing research areas in machine learning. RL-based techniques have led to several recent breakthroughs in artificial intelligence, such as beating human champions in the game of Go. The application of RL to real-life problems, however, remains limited, even in areas where a large amount of data has already been collected. The crux of the problem is that most existing RL methods require an environment for the agent to interact with, but in real-life applications it is rarely possible to access such an environment: deploying an algorithm that learns by trial and error may raise serious legal, ethical, and safety issues. This project aims to resolve this conundrum by developing algorithms that learn from offline data. The outcome of the research could significantly reduce the overhead of applying RL techniques to real-life sequential decision-making problems such as those in power transmission, personalized medicine, scientific discovery, computer networking, and public policy.

The project focuses on two settings that address the aforementioned challenge of limited access to an environment. In the first setting, the agent is given only historical data from logged interactions with the environment. In the second setting, the agent may change how it interacts with the environment only a few times. The investigators will develop mathematical theory that characterizes the difficulty of the problem and ensures that the developed algorithms are robust and optimal in the sense that they use the least possible resources (data, energy, computation). Using techniques such as marginalized importance sampling, uniform convergence, and batched exploration, the project will generalize the recent line of work on "breaking the curse of horizon" to allow function approximation and establish the much-needed statistical learning theory for offline and low-adaptive reinforcement learning.
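To make the first technique concrete, the sketch below illustrates marginalized importance sampling (MIS) for off-policy evaluation in a small tabular MDP: instead of multiplying per-step importance ratios along an entire trajectory (whose variance grows exponentially with the horizon, the "curse of horizon"), MIS estimates the ratio of the target and logging policies' marginal state distributions at each step and reweights rewards by that ratio times a single-step action ratio. This is only an illustrative sketch under assumed conditions; the toy MDP, policies, and all variable names are invented for this example and are not the project's actual algorithms or code.

```python
import numpy as np

# Illustrative sketch of marginalized importance sampling (MIS) for
# off-policy evaluation in a tabular MDP. The MDP, policies, and
# constants below are made-up assumptions for demonstration only.

rng = np.random.default_rng(0)
S, A, H, n = 5, 2, 10, 2000                    # states, actions, horizon, episodes

P = rng.dirichlet(np.ones(S), size=(S, A))     # P[s, a] = next-state distribution
R = rng.uniform(size=(S, A))                   # mean rewards
mu = np.full((S, A), 1.0 / A)                  # logging (behavior) policy: uniform
pi = np.zeros((S, A)); pi[:, 0], pi[:, 1] = 0.8, 0.2   # target policy to evaluate

def rollout(policy):
    """Generate one episode [(s_1, a_1, r_1), ..., (s_H, a_H, r_H)] under `policy`."""
    traj, s = [], rng.integers(S)
    for _ in range(H):
        a = rng.choice(A, p=policy[s])
        r = R[s, a] + 0.1 * rng.standard_normal()
        traj.append((s, a, r))
        s = rng.choice(S, p=P[s, a])
    return traj

data = [rollout(mu) for _ in range(n)]         # offline data logged under mu

# Empirical marginal state distribution of the logging policy at each step.
d_mu = np.zeros((H, S))
for traj in data:
    for t, (s, a, r) in enumerate(traj):
        d_mu[t, s] += 1.0 / n

# Recursively estimate the target policy's marginal state distribution:
# d_{t+1}^pi(s') ~= (1/n) * sum_i [d_t^pi / d_t^mu](s_{i,t})
#                               * [pi / mu](a_{i,t} | s_{i,t}) * 1{s_{i,t+1} = s'}.
d_pi = np.zeros((H, S))
d_pi[0] = d_mu[0]                              # same initial state distribution
for t in range(H - 1):
    flow = np.zeros(S)
    for traj in data:
        s, a, _ = traj[t]
        s_next = traj[t + 1][0]
        w = d_pi[t, s] / max(d_mu[t, s], 1e-12) * pi[s, a] / mu[s, a]
        flow[s_next] += w
    d_pi[t + 1] = flow / n

# MIS value estimate: reweight each logged reward by the marginal state ratio
# times a single-step action ratio (no product over the whole trajectory).
v_hat = 0.0
for traj in data:
    for t, (s, a, r) in enumerate(traj):
        w = d_pi[t, s] / max(d_mu[t, s], 1e-12) * pi[s, a] / mu[s, a]
        v_hat += w * r / n

print(f"MIS estimate of the target policy's {H}-step value: {v_hat:.3f}")
```

The key design point in this toy version is that the importance weight at step t depends only on the state distribution ratio at step t and one action ratio, so its magnitude does not compound over the horizon; extending this idea beyond the tabular case to function approximation is part of what the abstract describes.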

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-10-01
Budget End: 2023-09-30
Fiscal Year: 2020
Total Cost: $449,976
Name: University of California Santa Barbara
City: Santa Barbara
State: CA
Country: United States
Zip Code: 93106