This project focuses on several aspects of automated feature discovery in the context of reinforcement learning. Badly chosen features cause reinforcement-learning algorithms to fail and, as such, only individuals skilled in feature construction can create successful reinforcement-learning systems for novel tasks. This issue underscores two shortcomings in existing research. First, most existing reinforcement-learning methods cannot generate or discover features automatically and robustly. Second, existing benchmark problems and paradigms for benchmarking do not distinguish adequately between clever algorithm design and clever feature engineering.

This project addresses these challenges with a two-pronged approach. The first prong aims to advance a technical agenda leading to a new approach to feature discovery and model representation. The second prong is the development of a benchmark methodology and repository with a different focus and structure from existing endeavors. The goal of the benchmarking effort is to produce a set of fair and reproducible experiments that help elucidate the strengths and weaknesses of existing approaches, while simultaneously introducing challenges that motivate the development of new approaches.

Project Report

From collections of examples, computers can learn to distinguish passable from impassable roads, recognize hand-written characters, and predict the weather. Although many different kinds of these machine-learning algorithms have been developed, the critical element for successfully applying this technology to real-world problems is finding the right set of "features": vectors of numeric attributes that describe the examples in a form machines can manipulate. The purpose of this project was to explore the problem of automatically identifying promising features for reinforcement-learning problems. Reinforcement learning is the task of forecasting the success of possible behaviors in sequential decision problems. For example, consider the problem of learning to play a game like backgammon. It is easy to learn to recognize legal moves, but a good player is one who can predict which of the immediately reachable board positions is most beneficial in the long run. Much like predicting the weather, there are subtle combinations of features that signal whether a future outcome is likely to be good or bad. Over the course of the project, we discovered some fairly deep connections between two problems: predicting how the world changes from moment to moment, leading ultimately to some outcome, and predicting the outcome directly without imagining future world states. We showed that, in the case of basic linear predictions, the two approaches are two sides of the same coin. Indeed, many published algorithms are actually equivalent when viewed through this lens. By unifying disparate algorithms and strengthening our mathematical insights into how they work, we were able to provide new approaches that systematically construct the features likely to be most useful in supporting linear predictions of future outcomes.
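
The "two sides of the same coin" claim for linear predictions can be illustrated with a small numeric sketch. The code below is not the project's own software: the three-state chain, the rewards, the two features, the discount factor, and all function names are hypothetical values chosen for illustration. It fits a least-squares linear model of next features and rewards, solves for value weights inside that model, and compares the result with the weights of the direct linear fixed point (the quantity that least-squares temporal-difference methods estimate).

```python
# Hypothetical toy problem: 3 states, 2 features per state.
GAMMA = 0.9                      # discount factor (assumed)
P = [[0.5, 0.5, 0.0],            # state-transition probabilities
     [0.1, 0.6, 0.3],
     [0.2, 0.0, 0.8]]
R = [0.0, 1.0, 2.0]              # expected reward in each state
PHI = [[1.0, 0.0],               # feature vector of each state
       [1.0, 1.0],
       [1.0, 2.0]]


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[pivot] = m[pivot], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [x - f * y for x, y in zip(m[r], m[c])]
    return [m[i][n] / m[i][i] for i in range(n)]


def direct_weights():
    """Predict outcomes directly: solve Phi^T (Phi - gamma P Phi) w = Phi^T R."""
    phi_t = transpose(PHI)
    p_phi = matmul(P, PHI)
    diff = [[x - GAMMA * y for x, y in zip(r1, r2)]
            for r1, r2 in zip(PHI, p_phi)]
    a = matmul(phi_t, diff)
    b = [sum(x * r for x, r in zip(row, R)) for row in phi_t]
    return solve(a, b)


def model_based_weights():
    """Fit a linear model of next features and rewards, then solve it."""
    phi_t = transpose(PHI)
    gram = matmul(phi_t, PHI)
    target = matmul(phi_t, matmul(P, PHI))
    # Least-squares next-feature model F, so that P Phi is approximated by Phi F.
    f = transpose([solve(gram, col) for col in transpose(target)])
    # Least-squares reward model, so that R is approximated by Phi r_w.
    r_w = solve(gram, [sum(x * r for x, r in zip(row, R)) for row in phi_t])
    k = len(f)
    a = [[(1.0 if i == j else 0.0) - GAMMA * f[i][j] for j in range(k)]
         for i in range(k)]
    # Value weights inside the model: (I - gamma F) w = r_w.
    return solve(a, r_w)


if __name__ == "__main__":
    print("direct     :", direct_weights())
    print("model-based:", model_based_weights())
```

Under these assumptions the two weight vectors agree up to floating-point error, because multiplying the model-based system by the Gram matrix Phi^T Phi yields exactly the direct fixed-point system.
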
In addition to the technical significance of our results and the promise they hold for future applications, the project engaged in a number of activities with potentially broad impacts outside the academic discipline. The funding helped support the development of a new introductory computer science class called "Great Insights in Computer Science". Compared with typical introductory computer science courses, this one attracted a statistically significantly larger number of students from underrepresented groups, who had the opportunity to learn about core ideas in computer architecture, programming, and even machine learning. An offshoot of this class was an REU-supported project to create end-user programmable devices. The goal of this work has been to envision a future in which most people are familiar enough with programming that they can command their household appliances using their own simple algorithms. We built a collection of devices and experimented with various programming interfaces. Our preliminary studies indicate that students are indeed motivated to learn more about programming when it is paired with physical devices. Our research on robotic learning has been the basis of presentations on Rutgers TV (a campus-wide news program), at Rutgers Day (a campus-wide festival and open house), and to local middle school and elementary school students to help increase academic interest in science and technology. Our video of the demo won an award for best artificial intelligence video. Five graduate students participated in the project as part of their research training, several of whom have completed their studies and are now involved in computer science research in industry.

Budget Start: 2007-10-01
Budget End: 2012-05-31
Fiscal Year: 2007
Total Cost: $241,000
Name: Rutgers University
City: New Brunswick
State: NJ
Country: United States
Zip Code: 08901