A key aim in Natural Language Processing is to robustly map from natural language sentences to formal representations of their underlying meaning. Recent work has addressed this problem by learning semantic parsers given sentences paired with logical meaning representations. The goal of this project is to develop models and learning algorithms for recovering lexical structure, in the context of mapping sentences to logical form. This work is inspired by linguistic theories of the lexicon, but directly motivated by the limitations observed in current, state-of-the-art learning algorithms.
The central hypothesis is that a new probabilistic learning approach for lexical generalization can simultaneously achieve the goals of (1) language-independent learning, (2) robustness when analyzing natural, unedited text, and (3) reduced data annotation effort, in a computationally efficient manner that will scale to large learning problems. The approach under development induces a Combinatory Categorial Grammar (CCG) that is modified to replace the traditional, explicit list of lexical items in the lexicon with a distribution over lexical items, allowing significant generalization in the construction of possible syntactic and semantic structures for given input words. Modifying the CCG lexicon in this manner greatly increases the potential to generalize from the available training data without sacrificing the scalability that comes from working within an established grammar formalism for which efficient learning and parsing algorithms have been developed. This work will have impact at the algorithmic level and through applications, including advanced natural language interfaces to databases for non-technical users.
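The core idea above can be sketched in a few lines: instead of storing a hard-coded list of lexical items for each word, the lexicon assigns every candidate (category, logical form) pair a probability under a log-linear model. The inventory of categories, logical-form templates, and feature names below are illustrative assumptions for the sketch, not taken from the actual system.

```python
import math

# Hypothetical mini-inventory of CCG categories paired with logical-form
# templates (illustrative only; the real system induces these from data).
CATEGORIES = [
    ("N", "lambda x.CITY(x)"),
    ("S\\NP", "lambda x.MAJOR(x)"),
]

def features(word, category, lf):
    """Toy feature function: indicator features on (word, category)
    and (word, logical form) pairs."""
    return {("w-cat", word, category): 1.0, ("w-lf", word, lf): 1.0}

def score(weights, word, category, lf):
    """Linear score of a candidate lexical item under the current weights."""
    return sum(weights.get(f, 0.0) * v
               for f, v in features(word, category, lf).items())

def lexical_distribution(weights, word):
    """Distribution over lexical items for `word`: a softmax over all
    candidate (category, logical form) pairs, rather than membership
    in an explicit, fixed list of lexical entries."""
    scores = {(c, lf): score(weights, word, c, lf) for c, lf in CATEGORIES}
    z = sum(math.exp(s) for s in scores.values())
    return {item: math.exp(s) / z for item, s in scores.items()}

# Example: a learned weight favoring the noun analysis of "city".
weights = {("w-cat", "city", "N"): 2.0}
dist = lexical_distribution(weights, "city")
```

Because every word receives some probability mass for every candidate lexical item, the parser can hypothesize analyses for words never paired with a given category in training, which is the source of the generalization discussed above.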
A key aim in Natural Language Processing is to robustly map from natural language sentences to formal representations of their underlying meaning. Recent work has addressed this problem by learning semantic parsers given sentences paired with logical meaning representations. In this project, we developed new models and learning algorithms for recovering lexical structure, in the context of mapping sentences to logical form. The approach was inspired by linguistic theories of the lexicon, but directly motivated by the limitations observed in current, state-of-the-art learning algorithms. As proposed, we showed that a new probabilistic learning approach for lexical generalization can simultaneously achieve the goals of (1) language-independent learning, (2) robustness when analyzing natural, unedited text, and (3) reduced data annotation effort, in a computationally efficient manner that scales to large learning problems. The algorithm we designed induced a Combinatory Categorial Grammar (CCG) that was modified to replace the traditional, explicit list of lexical items in the lexicon with a distribution over lexical items, allowing significant generalization in the construction of possible syntactic and semantic structures for given input words. Modifying the CCG lexicon in this manner greatly increased the potential to generalize from the available training data without sacrificing the scalability that comes from working within an established grammar formalism for which efficient learning and parsing algorithms have been developed. This work had impact at the algorithmic level and through applications, including achieving state-of-the-art performance on a number of benchmark datasets for evaluating advanced natural language interfaces to databases for non-technical users.