In reinforcement learning (RL), autonomous agents (such as disaster recovery robots, self-driving cars or unmanned aerial vehicles) must learn about an unknown environment while simultaneously acting in that environment. Deep reinforcement learning is a variant of RL that leverages the power of deep neural networks (DNNs) to learn both how to extract information from sensors and how to transform that information into optimal actions. The combination of RL and deep learning has generated impressive advances, but it is expensive, both in terms of data and in terms of computation: typical agents learn only after tens or hundreds of millions of interactions with an environment, making most algorithms unusable in anything but a fast simulator. This stands in stark contrast to humans attempting the same tasks, who can perform well after only a few minutes of practice (or even after merely watching another human practice for a few minutes).

By its nature, this type of machine learning research builds tools for others to use. To connect these tools with disciplines outside of theoretical computer science and to improve outreach and education, this work integrates a new student exchange program, intended both to export results and to import technical challenges from other fields. The plan is rounded out with a mix of undergraduate research opportunities, competitions, and project-oriented classes focused on real-world systems and data -- all intended to spark excitement and communicate a hope for a better world through improved technology.

This work seeks to improve deep RL by addressing two fundamental issues: first, how to reduce the amount of data and computation needed by deep RL, and second, how to improve deep RL's ability to solve complex tasks by incorporating model-based prior knowledge. The technical strategy builds on ideas from cognitive science, mimicking three key human cognitive capabilities lacking in current deep RL algorithms: (1) humans' native ability to build models of the world, which allows them to (2) transfer knowledge from previous experience via abstraction, and (3) reason explicitly about their own uncertainty.

To accomplish this, the work combines the strengths of two frameworks: deep neural networks and Bayesian models. The DNNs provide low-level signal processing, flexible and learnable model components, and powerful building blocks for discriminative inference, while the Bayesian models provide high-level reasoning about objects, causality, and theory of mind in a coherent probabilistic framework that can deal explicitly with uncertainty. These capabilities are delivered by improved probabilistic programming frameworks that enable both the necessary models and algorithms: probabilistic programming permits the construction of complex probabilistic models, and provides natural opportunities for integration with DNNs through automated inference compilers.
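
As a small illustration of what "dealing explicitly with uncertainty" can mean for an agent, the sketch below uses Thompson sampling on a three-armed Bernoulli bandit. This is a standard textbook technique chosen here for illustration, not the method proposed in this project; the function name `thompson_sampling`, the arm probabilities, and the step count are all hypothetical. The agent keeps a Beta posterior over each arm's reward probability, samples from those posteriors, and acts on the samples, so arms it is uncertain about still get explored.

```python
import random


def thompson_sampling(true_probs, steps, seed=0):
    """Illustrative Beta-Bernoulli Thompson sampling (hypothetical helper).

    true_probs: reward probability of each arm (unknown to the agent).
    Returns the number of pulls per arm and the posterior mean per arm.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    # Beta(1, 1) prior on every arm: a uniform belief over reward probability.
    alpha = [1] * n_arms  # 1 + observed successes
    beta = [1] * n_arms   # 1 + observed failures
    pulls = [0] * n_arms

    for _ in range(steps):
        # Draw one plausible reward probability per arm from its posterior.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(n_arms)]
        # Act greedily with respect to the sampled beliefs.
        arm = max(range(n_arms), key=lambda a: samples[a])
        # Observe a Bernoulli reward from the (simulated) environment.
        reward = 1 if rng.random() < true_probs[arm] else 0
        # Bayesian update of the chosen arm's posterior.
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1

    posterior_means = [alpha[a] / (alpha[a] + beta[a]) for a in range(n_arms)]
    return pulls, posterior_means


pulls, posterior_means = thompson_sampling([0.2, 0.5, 0.8], steps=2000)
print("pulls per arm:", pulls)
print("posterior means:", posterior_means)
```

In this toy setting the posterior is available in closed form; the richer models described above generally require approximate inference, which is where probabilistic programming frameworks come in.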

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1652950
Program Officer: Weng-keen Wong
Project Start:
Project End:
Budget Start: 2017-03-01
Budget End: 2022-02-28
Support Year:
Fiscal Year: 2016
Total Cost: $211,952
Indirect Cost:
Name: Brigham Young University
Department:
Type:
DUNS #:
City: Provo
State: UT
Country: United States
Zip Code: 84602