RI: Small: Non-parametric Approximate Dynamic Programming for Continuous Domains

Parr, Ronald

Abstract

This project concerns a machine learning technique known as reinforcement learning, which is related to, but distinct from, the notion of reinforcement learning used in psychology. The common element is that both views study changes in behavior that result from experience. In the machine learning case, the behaviors are often decision making in dynamic environments, such as controlling a robot, a factory, inventory levels for a warehouse or even drug dosage levels. Current theoretical development in this area guarantees that optimal decisions can be made by reinforcement learning algorithms, but only under restrictive assumptions that are difficult to ensure in practice. Efforts to apply reinforcement learning to significant practical problems have enjoyed some success, but such efforts often forgo theoretical guarantees and rely upon tedious parameter adjustments by experts (human trial and error) to achieve success.

This research seeks to reduce the amount of human trial and error needed to make reinforcement learning successful, thereby making it a more accessible tool to a wider range of people. Specifically, it will focus on algorithms for domains described by continuous variables, seeking to provide stronger theoretical guarantees for such domains as well as an approach that balances the anticipated benefit of trying new things with the benefit of sticking to what is already known about a problem (exploration vs. exploitation). A practical benefit of success in this area would be improved techniques that make it easier for people to deploy algorithms that learn and improve performance in a variety of practical tasks like those mentioned above: robot or factory control, inventory management, or drug delivery.

This project plans to use a model helicopter as a challenge domain, but it is not about helicopter control per se. Rather, it seeks to develop general techniques that can apply to many problems, including helicopters, and will use model helicopters as an inexpensive and fun way to motivate students. The project aims to develop a model helicopter simulator (to reduce the cost and risk of trying everything on an actual helicopter) and plans to make this simulator available to the research community, providing a fun and challenging benchmark problem.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1218931
Program Officer: Weng-keen Wong

Project Start
Project End
Budget Start: 2012-08-01
Budget End: 2018-07-31
Support Year
Fiscal Year: 2012
Total Cost: $458,000
Indirect Cost

RI: Small: Non-parametric Approximate Dynamic Programming for Continuous Domains
Parr, Ronald
Duke University, Durham, NC, United States

Abstract

Funding Agency

Institution

Comments

Recent in Grantomics:

Recently viewed grants:

Recently added grants:

Abstract

Funding Agency

Institution

Comments