Statistical learning is now central to natural language processing (NLP). Bridging the gap between learning and linguistic representation requires going beyond learning parameters. This CAREER project addresses three challenging, unresolved questions:

1. Given recent advances in learning the parameters of linguistic models and in approximate inference, how can the process of feature design be automated?

2. Given that NLP tasks are often defined without recourse to real applications and that a specific annotated dataset is unlikely to fulfill the needs of multiple NLP projects, can learning frameworks be extended to perform automatic task refinement, simplifying a linguistic analysis task to obtain more consistent, more precise, or faster performance?

3. Can computational models of language take into account the non-text context in which our linguistic data are embedded? Building on recent success in social text analysis and text-driven forecasting, this CAREER project seeks to exploit context to refine models of linguistic structure while enabling advances in this application area.

This basic research supports advances in a wide range of language engineering applications and discrete data analysis. In addition to core research advances, this CAREER project contributes a new publicly available parser that models the most consistently learnable elements of syntactic structure. Educational activities include a new project based on text-driven forecasting within the PI's undergraduate NLP course and a new undergraduate course in machine learning. The project also supports the PI's involvement in outreach to high school students and to a wider range of students at CMU by exposing aspects of his research in non-CS classrooms.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1054319
Program Officer: Tatiana Korelsky
Project Start:
Project End:
Budget Start: 2011-02-01
Budget End: 2016-01-31
Support Year:
Fiscal Year: 2010
Total Cost: $565,812
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213