Natural language understanding, automatically computing the meaning of text, is key to allowing citizens to deal intelligently with the vast amount of digital information surrounding us, from the fine print on credit cards to science textbook chapters and online instructional material. The goal of this project is to develop systems that build richer understandings of text than current systems do. Humans have a remarkable ability to integrate the structure of meaning --- how the meanings of sentences are built up from the meanings of words --- with statistical knowledge about which words occur together with other words. Humans also effortlessly integrate meaning with 'reference', knowing which people or events in the world a text is talking about. These tasks remain quite difficult for computational systems. This project builds new computational models that integrate deep neural networks --- computational models with great power for representing word meaning in a statistical way --- with computational methods from logic and semantics. The new models allow word meanings to be combined into sentence meanings and allow those meanings to be linked to entities and events in the world. The resulting representations should help enable societally important language understanding applications such as question answering and tutorial software.

This project develops compositional forms of deep learning that bridge lexical and compositional semantics. This includes new kinds of embeddings that support better meaning composition, computing, for example, that "a student with a plaster cast" is similar to "an injured person", just as earlier embeddings computed that "injured" is similar to "hurt", and extending the virtues of embeddings (such as broad lexical coverage) to represent the denotations of logical predicates. Another focus is enriching models of meaning with models of reference: building entity-based models that can resolve coreference in texts, handling problems like bridging anaphora and verb and event coreference, with algorithms for entity-based coreference based on tensors that capture similarity of reference rather than similarity of lexical meaning. A third focus is developing vector space lexicons that represent both natural language dependency tree fragments and logical fragments in a shared vector space, and representing meaning as general programs that can model the effects of events and processes on resources in the world. The new models are brought to bear on the end-to-end task of learning semantic parsers that map text to a semantic denotation.
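To make the embedding-composition idea concrete, here is a minimal sketch in Python with NumPy: word vectors are averaged to form phrase vectors, which are then compared with cosine similarity. The four-dimensional toy vectors, the small vocabulary, and the helper names are illustrative assumptions, not the project's trained embeddings or its composition model.

```python
import numpy as np

EMB = {                      # hypothetical toy embeddings, not trained vectors
    "student": np.array([0.1, 0.8, 0.2, 0.1]),
    "plaster": np.array([0.6, 0.1, 0.7, 0.2]),
    "cast":    np.array([0.7, 0.2, 0.6, 0.1]),
    "injured": np.array([0.5, 0.3, 0.8, 0.3]),
    "person":  np.array([0.1, 0.9, 0.1, 0.2]),
    "hurt":    np.array([0.5, 0.3, 0.7, 0.4]),
}

def compose(phrase):
    """Average the embeddings of the known content words in a phrase."""
    vecs = [EMB[w] for w in phrase.split() if w in EMB]
    return np.mean(vecs, axis=0)

def cosine(u, v):
    """Cosine similarity between two vectors."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Word-level similarity ("injured" vs. "hurt") and phrase-level similarity
# ("a student with a plaster cast" vs. "an injured person").
print(cosine(EMB["injured"], EMB["hurt"]))
print(cosine(compose("a student with a plaster cast"),
             compose("an injured person")))
```

Similarly, the following sketch assumes a simple bilinear (order-2 tensor) scorer as one hypothetical way a tensor could capture similarity of reference between two mention representations rather than similarity of lexical meaning; the matrix here is random, whereas a real coreference system would learn it from data.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
W = rng.normal(size=(d, d))           # stand-in for a learned parameter tensor

def coref_score(m1, m2):
    """Bilinear compatibility score m1^T W m2 between two mention vectors."""
    return float(m1 @ W @ m2)

mention_a = rng.normal(size=d)        # e.g., a vector for "the injured student"
mention_b = rng.normal(size=d)        # e.g., a vector for "she"
print(coref_score(mention_a, mention_b))
```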

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1514268
Program Officer: Tatiana Korelsky
Budget Start: 2015-06-01
Budget End: 2019-05-31
Fiscal Year: 2015
Total Cost: $1,100,000
Name: Stanford University
City: Stanford
State: CA
Country: United States
Zip Code: 94305