Reinforcement learning (RL) is one of the fastest-growing research areas in machine learning. RL-based techniques have led to several recent breakthroughs in artificial intelligence, such as beating human champions in the game of Go. The application of RL to real-life problems, however, remains limited, even in areas where a large amount of data has already been collected. The crux of the problem is that most existing RL methods require an environment for the agent to interact with, but in real-life applications it is rarely possible to access such an environment: deploying an algorithm that learns by trial and error may raise serious legal, ethical, and safety issues. This project aims to resolve this conundrum by developing algorithms that learn from offline data. The outcome of the research could significantly reduce the overhead of applying RL techniques to real-life sequential decision-making problems such as those in power transmission, personalized medicine, scientific discovery, computer networking, and public policy.

The project focuses on two settings that address the aforementioned challenge of limited access to an environment. In the first setting, the agent is given only historical data from logged interactions with the environment. In the second setting, the agent may change how it interacts with the environment only a few times. The investigators will develop mathematical theory that characterizes the difficulty of the problem and ensures that the developed algorithms are robust and optimal in the sense that they use the least possible resources (data, energy, computation). Using techniques such as marginalized importance sampling, uniform convergence, and batched exploration, the project will generalize the recent line of work on "breaking the curse of horizon" to allow function approximation and establish the much-needed statistical learning theory for offline and low-adaptive reinforcement learning.
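To make the first technique concrete, the sketch below illustrates marginalized importance sampling (MIS) for off-policy evaluation in a small tabular MDP: instead of multiplying per-step importance ratios along an entire trajectory (whose variance grows exponentially with the horizon, the "curse of horizon"), MIS estimates the ratio of the target and logging policies' marginal state distributions at each step and reweights rewards by that ratio times a single-step action ratio. This is only an illustrative sketch under assumed conditions; the toy MDP, policies, and all variable names are invented for this example and are not the project's actual algorithms or code.

```python
import numpy as np

# Illustrative sketch of marginalized importance sampling (MIS) for
# off-policy evaluation in a tabular MDP. The MDP, policies, and
# constants below are made-up assumptions for demonstration only.

rng = np.random.default_rng(0)
S, A, H, n = 5, 2, 10, 2000                    # states, actions, horizon, episodes

P = rng.dirichlet(np.ones(S), size=(S, A))     # P[s, a] = next-state distribution
R = rng.uniform(size=(S, A))                   # mean rewards
mu = np.full((S, A), 1.0 / A)                  # logging (behavior) policy: uniform
pi = np.zeros((S, A)); pi[:, 0], pi[:, 1] = 0.8, 0.2   # target policy to evaluate

def rollout(policy):
    """Generate one episode [(s_1, a_1, r_1), ..., (s_H, a_H, r_H)] under `policy`."""
    traj, s = [], rng.integers(S)
    for _ in range(H):
        a = rng.choice(A, p=policy[s])
        r = R[s, a] + 0.1 * rng.standard_normal()
        traj.append((s, a, r))
        s = rng.choice(S, p=P[s, a])
    return traj

data = [rollout(mu) for _ in range(n)]         # offline data logged under mu

# Empirical marginal state distribution of the logging policy at each step.
d_mu = np.zeros((H, S))
for traj in data:
    for t, (s, a, r) in enumerate(traj):
        d_mu[t, s] += 1.0 / n

# Recursively estimate the target policy's marginal state distribution:
# d_{t+1}^pi(s') ~= (1/n) * sum_i [d_t^pi / d_t^mu](s_{i,t})
#                               * [pi / mu](a_{i,t} | s_{i,t}) * 1{s_{i,t+1} = s'}.
d_pi = np.zeros((H, S))
d_pi[0] = d_mu[0]                              # same initial state distribution
for t in range(H - 1):
    flow = np.zeros(S)
    for traj in data:
        s, a, _ = traj[t]
        s_next = traj[t + 1][0]
        w = d_pi[t, s] / max(d_mu[t, s], 1e-12) * pi[s, a] / mu[s, a]
        flow[s_next] += w
    d_pi[t + 1] = flow / n

# MIS value estimate: reweight each logged reward by the marginal state ratio
# times a single-step action ratio (no product over the whole trajectory).
v_hat = 0.0
for traj in data:
    for t, (s, a, r) in enumerate(traj):
        w = d_pi[t, s] / max(d_mu[t, s], 1e-12) * pi[s, a] / mu[s, a]
        v_hat += w * r / n

print(f"MIS estimate of the target policy's {H}-step value: {v_hat:.3f}")
```

The key design point in this toy version is that the importance weight at step t depends only on the state distribution ratio at step t and one action ratio, so its magnitude does not compound over the horizon; extending this idea beyond the tabular case to function approximation is part of what the abstract describes.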

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-10-01
Budget End: 2023-09-30
Fiscal Year: 2020
Total Cost: $449,976
Name: University of California Santa Barbara
City: Santa Barbara
State: CA
Country: United States
Zip Code: 93106