Over the years, Machine Learning has become a very broad discipline with important applications to many areas including computer vision, speech recognition, robotics, and bio-surveillance, to name just a few. Moreover, many of these application areas have faced a huge increase in the volume of available data of various kinds. In order to be able to use all this data a number of new learning approaches have been proposed. These approaches have been intensely explored in the machine learning community, with many heuristics and specific algorithms, as well as experimental results reported. Unfortunately, however, the standard theoretical models do not capture the key issues involved in these learning techniques, and it has become clear that for developing robust, versatile, and general algorithms in these settings a more fundamental understanding is necessary. This project will develop theoretical foundations for such learning paradigms which are of significant practical importance but are not explained by existing theoretical models. This project will also develop fundamental new connections between Machine Learning, Game Theory, and Combinatorial Optimization, that will aid in advancing and solving important problems in all three areas.

The key research directions of this project are:

1. Developing mathematical foundations and algorithms for important machine learning paradigms that are not captured well by standard models. This includes new analysis frameworks as well as new practical and theoretically justified algorithms both for semi-supervised learning and active learning, two important emerging approaches for incorporating unlabeled data and interaction in the learning process. This project will also explore a fundamentally new approach to analyzing clustering -- a classic central task in the analysis and exploration of data, for which the existing theory has been very brittle. This framework will enable practitioners to describe in a formal way the properties they believe to be true about their data, and then use these properties to choose or design the right algorithm for their needs.

2. Developing novel fundamental connections between Machine Learning and Algorithmic Game Theory in order to solve difficult problems in multi-agent systems that have resisted previous approaches. In particular, many multi-agent interactions have bad equilibria, and it is important to develop methods for helping agents in such bad states to move to better ones. This project will develop techniques for understanding and influencing the behavior of natural dynamics in games of this type, by using connections to important concepts in Machine Learning, such as learning from untrusted experts' advice.

3. Developing fundamental connections between Machine Learning and Combinatorial Optimization in order to advance both areas. These include new connections between approximation algorithms and learning-based objectives for clustering, and new algorithms and computational theory for learning submodular functions. Submodular functions, which describe laws of diminishing returns, are ubiquitous in economic optimization problems, and methods for learning them from observed data can aid in designing improved decision procedures.

Altogether, this project will advance Machine Learning, Algorithmic Game Theory, and Combinatorial Optimization, by developing and exploiting novel connections between these areas. The theory developed by this work will enable the next generation of powerful algorithms for machine learning and multi-agent systems. It will additionally impact a wide range of application areas including computer vision, robotics, bio-surveillance, and online auction design. More broadly, the research results of this project will have impact across a number of scientific, medical, and industrial fields. The PI's education plan further contributes to the project's impact. In addition to advising a diverse set of students on projects directly related to this project, the progress in this research will be used to influence the curriculum via special courses presenting the theoretical advances along with their applications. The PI will also contribute to increasing the participation of women in computational sciences.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Application #
1451177
Program Officer
Jack S. Snoeyink
Project Start
Project End
Budget Start
2014-06-01
Budget End
2015-11-30
Support Year
Fiscal Year
2014
Total Cost
$289,279
Indirect Cost
Name
Carnegie-Mellon University
Department
Type
DUNS #
City
Pittsburgh
State
PA
Country
United States
Zip Code
15213