Project Report

Markov decision processes (MDPs) are a common formalization of sequential decision problems. One important example is controlling a robot: at each moment, the robot must observe the state of the world around it and select an action. When the robot achieves a desired task, it receives a reward, and its goal is to maximize the amount of reward it receives. Methods for solving MDPs (that is, finding good decision rules) have become quite powerful in recent years, but they struggle when the action space is high-dimensional. A high-dimensional action space arises when the robot has many "muscles" (strictly speaking, actuators) and must select an activation level for each one at every moment.

In this work, we drew inspiration from how animals use motor primitives to control large numbers of muscles. Rather than selecting an action for each muscle individually, animals select activation levels for a small number of motor primitives, each of which produces a pattern of activation over all of the actual muscles. If there are significantly fewer motor primitives than muscles, the problem is reduced from a high-dimensional one (selecting activation levels for many muscles) to a low-dimensional one (selecting activation levels for a few motor primitives). We developed a method that automatically searches for the motor primitives that perform best for a given system (e.g., a particular robot). Once these motor primitives have been found, learning new tasks becomes easier.

We performed experiments using a simulated arm with 18 to 36 muscles and found that we could control it well using only two motor primitives, although learning performed best with four. We thus compressed the action space from 18 to 36 dimensions down to 2 to 4 dimensions, and we showed that learning did indeed progress much faster in this lower-dimensional space.
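The dimensionality reduction described above can be illustrated with a minimal sketch. It is not the method developed in this project. It assumes, purely for illustration, that each motor primitive is a fixed vector of per-muscle weights, that primitives combine linearly, and that muscle activations are clipped to the range [0, 1]. The function names `make_primitives` and `muscle_activations` are hypothetical, and the primitives here are random placeholders rather than ones found by a search.

```python
import random


def make_primitives(num_primitives, num_muscles, seed=0):
    """Return placeholder primitives: one activation pattern per primitive.

    Assumption: patterns are random here; the report describes searching
    for well-performing primitives instead.
    """
    rng = random.Random(seed)
    return [
        [rng.uniform(-1.0, 1.0) for _ in range(num_muscles)]
        for _ in range(num_primitives)
    ]


def muscle_activations(primitives, primitive_action):
    """Expand a low-dimensional action into one activation per muscle.

    Assumption: a weighted sum of primitive patterns, clipped to [0, 1].
    """
    if len(primitive_action) != len(primitives):
        raise ValueError("need exactly one activation level per primitive")
    num_muscles = len(primitives[0])
    activations = []
    for m in range(num_muscles):
        total = sum(
            level * pattern[m]
            for level, pattern in zip(primitive_action, primitives)
        )
        activations.append(min(1.0, max(0.0, total)))
    return activations


# The learner chooses 4 numbers; the arm receives 18.
primitives = make_primitives(num_primitives=4, num_muscles=18)
action = [0.5, -0.2, 0.8, 0.1]
muscles = muscle_activations(primitives, action)
print(len(action), "->", len(muscles))  # prints "4 -> 18"
```

Under these assumptions, a learning algorithm only has to explore the 4-dimensional space of `action` values, while the fixed primitives take care of producing a full 18-dimensional muscle command.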

National Science Foundation (NSF)
Office of International and Integrative Activities (IIA)
Program Officer: Carter Kimsey
Thomas Philip S
United States