Semantic Entity and Relation Extraction from Web-scale Text Document Collections

Collins, Michael

Abstract

This project addresses current limitations in automatic information extraction technology. The specific objectives are to: (1) use bootstrapping techniques to greatly increase the number of entity and relation types that can be extracted and the rate at which new extractors can be created; (2) improve the performance of supervised training for entity and relation extractors by using bootstrapping to add training features and by applying new supervised learning techniques, including new perceptron and discriminative training techniques; and (3) address metadata issues of provenance, confidence, and temporal extent of facts, focusing particularly on constructing a model of the expected lifetime of facts based on a longitudinal corpus of Web data.

The outcome of the project will be scientific understanding and technology for automatic information extraction from free text, making it possible to convert large document collections into formal databases suitable for automated processing. This will represent a significant enhancement in the utility and societal benefit of digital libraries and the World Wide Web. Project results will be disseminated in the form of publications and publicly available code for information extraction and learning of extractors.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0308370
Program Officer: Tatiana D. Korelsky

Budget Start: 2003-06-15
Budget End: 2006-05-31
Fiscal Year: 2003
Total Cost: $185,241

Institution: Massachusetts Institute of Technology, Cambridge, MA, United States
