A fundamental challenge in machine learning is the design of computational agents that, rather than being explicitly programmed, autonomously learn complex tasks in stochastic real-world environments. Past approaches, such as reinforcement learning algorithms for solving Markov decision processes, scale poorly to large state spaces. The proposed research addresses this curse of dimensionality by investigating a novel framework combining reinforcement learning and online convex optimization, in particular mirror descent and related algorithms. Mirror descent scales significantly better than classical first-order gradient descent in high-dimensional state spaces, by using a distance-generating function specific to a particular state space geometry.

The proposed framework enables several significant algorithmic advances in the design of autonomous machine learning agents: a new class of first-order mirror-descent based methods for learning sparse solutions to Markov decision processes will be developed that scale significantly significantly better than previous second-order methods; novel hierarchical methods for solving semi-Markov decision processes will be investigated; and finally, applications to a variety of high-dimensional Markov decision processes will be explored.

The anticipated outcomes of the proposed work include foundational advances in designing autonomous agents that learn to solve sequential decision-making problems, which will impact a large number of target applications from manufacturing to robotics and scheduling. The educational goal includes the development of a graduate-level course in online convex optimization for sequential decision-making, as well as interdisciplinary tutorials to enhance the cross-fertilization of ideas from applied mathematics and optimization to machine learning and artificial intelligence.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Type
Standard Grant (Standard)
Application #
1216467
Program Officer
Weng-keen Wong
Project Start
Project End
Budget Start
2012-08-01
Budget End
2016-07-31
Support Year
Fiscal Year
2012
Total Cost
$449,991
Indirect Cost
Name
University of Massachusetts Amherst
Department
Type
DUNS #
City
Hadley
State
MA
Country
United States
Zip Code
01035