Over the last decade, research on discriminative learning methods like support vector machines (SVMs) and boosting has raised the state of the art in machine learning, not only with respect to prediction accuracy, but also in terms of theoretical understanding and robustness. So far, however, almost all of this research has been limited to problems of classification and regression. But what if the object we want to predict is not a single class or a real number, but a complex object like a tree, a sequence, or a set of dependent labels? Such problems are ubiquitous, for example, in natural language parsing, information extraction, and text classification. This project will extend highly successful learning methods--in particular large-margin methods like SVMs--to the problem of predicting such multivariate and interdependent outputs. Specifically, the project will produce methods that can handle three types of dependencies: structural, correlation, and inductive dependencies. The intellectual merit of this project is the development of methods, their underlying theory, and efficient algorithms that can handle and exploit dependencies in complex outputs. Broader impact will come from applied work in several domains (e.g., bioinformatics, computational linguistics), as well as from making software implementations of the algorithms publicly available for teaching and research in applied fields.

Agency
National Science Foundation (NSF)
Institute
Division of Information and Intelligent Systems (IIS)
Application #
0412894
Program Officer
Douglas H. Fisher
Project Start
Project End
Budget Start
2004-08-15
Budget End
2008-07-31
Support Year
Fiscal Year
2004
Total Cost
$270,000
Indirect Cost
Name
Cornell University
Department
Type
DUNS #
City
Ithaca
State
NY
Country
United States
Zip Code
14850