Reinforcement learning has seen substantial research advances in recent years and has achieved success in many practical applications. However, reinforcement-learning algorithms also have a reputation for being data- and computation-hungry in large-scale applications. This project will address this issue by studying how to make reinforcement-learning algorithms scalable by introducing multiple learning agents and allowing them to collect data and learn optimal strategies collaboratively. The outcomes of this project will have impact on numerous areas where reinforcement learning is used at scale, e.g., multi-phase clinical trials, training of autonomous-driving algorithms, crowdsourcing tasks, and pricing and assortment optimization for stores at different locations. The research products will be disseminated via talks at academic conferences and workshops, universities, industrial labs, and online media, and will also be integrated into two courses at the forefront of reinforcement learning and big-data algorithms.

More technically, this project will study how to address fundamental constraints on communication and adaptivity among the learning agents. In particular, it will investigate a handful of collaborative learning models, including full communication, synchronized communication, synchronized communication with limited adaptivity, and asynchronous communication, and will study the following general questions: (1) what is the fundamental advantage of allowing adaptivity in the parallel learning model; (2) are there inherent differences in the degree of parallelism achievable by model-based and model-free reinforcement learning; (3) what is the impact of asynchronous communication; and (4) can general algorithmic techniques in reinforcement learning be parallelized in a communication-efficient manner? The team of researchers will address these questions by studying a set of core problems, including best-arm(s) identification and regret minimization in multi-armed bandits, contextual bandits, finite-state Markov decision process (MDP) learning, reinforcement learning with function approximation, and coordinated exploration in MDPs. Through these questions, the project will bring new techniques, perspectives, and insights to communication-efficient parallel reinforcement learning. The project will also have a significant impact on a number of related research areas, such as control theory, operations research, information theory and communication complexity, and multi-agent systems.
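To make the synchronized-communication-with-limited-adaptivity model concrete, the following Python sketch simulates best-arm identification via successive elimination: several agents pull arms independently within a round and pool their statistics only at round boundaries, so the number of rounds caps the adaptivity of the procedure. This is a minimal illustration under assumed Gaussian rewards, not an algorithm developed by the project; all names and parameter values are illustrative assumptions.

import math
import random

def parallel_successive_elimination(arm_means, n_agents=4, n_rounds=5,
                                    pulls_per_round=100, delta=0.05):
    """Best-arm identification with synchronized communication:
    agents pull arms independently and share statistics only at
    the end of each round (limited adaptivity). Illustrative sketch."""
    k = len(arm_means)
    active = list(range(k))
    total_pulls = {a: 0 for a in range(k)}
    total_reward = {a: 0.0 for a in range(k)}

    for _ in range(n_rounds):
        # Each agent pulls every surviving arm; no mid-round communication.
        for _agent in range(n_agents):
            for a in active:
                for _ in range(pulls_per_round):
                    total_reward[a] += random.gauss(arm_means[a], 1.0)  # noisy reward
                    total_pulls[a] += 1
        # Synchronization point: pool statistics across agents and
        # eliminate arms whose upper confidence bound falls below the
        # best lower confidence bound.
        means = {a: total_reward[a] / total_pulls[a] for a in active}
        rad = {a: math.sqrt(2 * math.log(2 * k * n_rounds / delta) / total_pulls[a])
               for a in active}
        best_lcb = max(means[a] - rad[a] for a in active)
        active = [a for a in active if means[a] + rad[a] >= best_lcb]
        if len(active) == 1:
            break
    # Return the empirically best surviving arm.
    return max(active, key=lambda a: total_reward[a] / total_pulls[a])

if __name__ == "__main__":
    random.seed(0)
    print(parallel_successive_elimination([0.1, 0.3, 0.5, 0.7]))  # expected: 3

Shrinking n_rounds reduces the number of synchronization points at the cost of statistical efficiency, which is precisely the communication-adaptivity tension the project studies.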

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2020-10-01
Budget End: 2023-09-30
Fiscal Year: 2020
Total Cost: $257,745
Name: University of Illinois Urbana-Champaign
City: Champaign
State: IL
Country: United States
Zip Code: 61820