This project concerns a machine learning technique known as reinforcement learning, which is related to, but distinct from, the notion of reinforcement learning used in psychology. The common element is that both study changes in behavior that result from experience. In the machine learning case, the behavior typically consists of decisions made in dynamic environments, such as controlling a robot or a factory, managing inventory levels for a warehouse, or even setting drug dosage levels. Current theory guarantees that reinforcement learning algorithms can make optimal decisions, but only under restrictive assumptions that are difficult to ensure in practice. Efforts to apply reinforcement learning to significant practical problems have enjoyed some success, but such efforts often forgo theoretical guarantees and rely on tedious parameter adjustment by experts (human trial and error).

This research seeks to reduce the amount of human trial and error needed to make reinforcement learning successful, thereby making it accessible to a wider range of people. Specifically, it will focus on algorithms for domains described by continuous variables, seeking both stronger theoretical guarantees for such domains and an approach that balances the anticipated benefit of trying new things against the benefit of sticking to what is already known about a problem (exploration vs. exploitation). A practical benefit of success in this area would be improved techniques that make it easier for people to deploy algorithms that learn and improve performance in practical tasks like those mentioned above: robot or factory control, inventory management, or drug delivery.

This project plans to use a model helicopter as a challenge domain, but it is not about helicopter control per se. Rather, it seeks to develop general techniques that can apply to many problems, including helicopters, and will use model helicopters as an inexpensive and fun way to motivate students. The project aims to develop a model helicopter simulator (to reduce the cost and risk of trying everything on an actual helicopter) and plans to make this simulator available to the research community, providing a fun and challenging benchmark problem.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1218931
Program Officer: Weng-keen Wong
Project Start:
Project End:
Budget Start: 2012-08-01
Budget End: 2018-07-31
Support Year:
Fiscal Year: 2012
Total Cost: $458,000
Indirect Cost:
Name: Duke University
Department:
Type:
DUNS #:
City: Durham
State: NC
Country: United States
Zip Code: 27705