Reinforcement learning (RL) applies to any task in which an agent takes a sequence of actions and the effects of one action influence the long-term utility of subsequent actions. How should such a learning agent represent its knowledge about its environment? Traditional models typically capture the agent's state in terms of objects and events in the environment and the relations among them. Since these relations cannot be directly observed by the agent through its sensors, they have meaning only in the mind of the agent's human designer. Recently, the PI and colleagues have instead proposed modeling the agent's state as a set of predictions about the observable outcomes of tests or experiments that the agent could perform in its environment. Such representations, called predictive state representations (PSRs), are composed entirely of observable quantities, and therein lies much of their promise for efficient and scalable planning and learning in RL tasks.

In this project, many foundational questions about PSRs are being explored, including:

(1) How can an agent discover which predictions it should maintain to capture the state of its environment?
(2) How can the long-term, action-conditional predictions that make up a PSR state be used to speed up planning, which is fundamentally about evaluating the long-term effects of actions?
(3) How can memory of past observations be combined with PSR predictions of future observations for computational benefit?
(4) How can flexible, temporally abstract representations of state (specifically PSRs) be combined with similarly temporally abstract representations of actions?

This project is developing the nascent idea of PSRs into a full-fledged theory of learning and planning. If successful, this research will dramatically increase the applicability of RL for building learning agents in large-scale domains in AI, operations research, control, and dynamical systems. The project also plans to construct and make publicly available a set of benchmark RL tasks, helping to remedy the lack of such widely available test beds in the RL community.
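For concreteness, the following is a minimal sketch of how a linear PSR maintains its prediction-vector state, based on the standard formulation in the PSR literature rather than on any specifics of this project. The state is a vector p with one prediction per core test, updated after each action-observation pair by a normalized linear map. All names here (LinearPSR, p0, m, M, predict, update) are illustrative assumptions, and the update parameters are assumed to have already been learned.

    import numpy as np

    class LinearPSR:
        """State = vector p of predictions, one entry per core test.

        For each (action, observation) pair (a, o), the model assumes:
          m[(a, o)] : weight vector with Pr(o | history, a) = m . p
          M[(a, o)] : matrix whose i-th row extends core test i by (a, o)
        """

        def __init__(self, p0, m, M):
            self.p = np.asarray(p0, dtype=float)  # current prediction vector
            self.m = m  # dict: (action, obs) -> one-step weight vector
            self.M = M  # dict: (action, obs) -> state-update matrix

        def predict(self, action, obs):
            # Predicted probability of seeing `obs` after taking `action`
            # from the current state.
            return float(self.m[(action, obs)] @ self.p)

        def update(self, action, obs):
            # Normalized linear update: p <- M_ao p / (m_ao . p),
            # i.e., condition every core-test prediction on the new (a, o).
            self.p = (self.M[(action, obs)] @ self.p) / self.predict(action, obs)
            return self.p

In practice the vectors m and matrices M would be learned from interaction data, and deciding which core tests the vector p should track is exactly the discovery problem raised in question (1) above.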

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0413004
Program Officer: Jie Yang
Project Start:
Project End:
Budget Start: 2005-02-15
Budget End: 2010-01-31
Support Year:
Fiscal Year: 2004
Total Cost: $274,992
Indirect Cost:
Name: University of Michigan Ann Arbor
Department:
Type:
DUNS #:
City: Ann Arbor
State: MI
Country: United States
Zip Code: 48109