This research involves developing new connections between the areas of machine learning, on-line algorithms, and optimization, and using these connections to address fundamental problems in all three areas.

In machine learning, one focus of this research is how to best combine a small sample of labeled data with a large amount of unlabeled data in order to produce high quality predictions. This type of problem has become especially important given the explosion of data now available over the web. This research is exploring how techniques such as network flow and graph cuts from the optimization literature can be used to provide a new means of attack, and how to best make a number of design choices that arise in this approach. Another basic question this work investigates is what kinds of concepts can be automatically learned in the presence of highly noisy data, and to what extent substantially new types of algorithms may be possible. The PI recently gave the first algorithm to learn a class of concepts in the presence of noise that is provably not learnable by a wide class of techniques known as Statistical Query methods. The current research aims to expand on this work and explore the extent to which it can be pushed much further. The types of learning problems being studied have close connections to problems of decoding random linear codes and finding approximate shortest lattice vectors that arise in cryptography. Improvements to the learning algorithms should impact our understanding of those problems as well.

Another major thrust of this research is the use of techniques from machine learning to address problems in online algorithms. In particular, the highly-developed "weighted experts" technology in machine learning suggests new approaches for combining multiple online algorithms that may simplify a number of longstanding open problems. This work is studying the extent to which this connection can provide insight into several basic questions, such as dynamic optimality in search trees and the weighted caching problem. Finally, this research is also studying a number of basic approximability questions, as well as exploring new frameworks for the analysis of local search techniques.

Project Start
Project End
Budget Start
2001-09-01
Budget End
2004-08-31
Support Year
Fiscal Year
2001
Total Cost
$285,990
Indirect Cost
Name
Carnegie-Mellon University
Department
Type
DUNS #
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213