Workshop Proposal: Content of Linguistic Annotation: Standards and Practices (CLASP)

Meyers, Adam

Abstract

At an NSF-sponsored meeting in New York City in September 2009, the NLP community is discussing the standardization and harmonization of the content of manual and automatic linguistic annotation. The meeting builds on the results of the previous Computing Research Infrastructure (CRI) award "Towards a Comprehensive Linguistic Annotation of Language" by establishing standards that researchers and developers are likely to follow. These standards govern tokenization, part of speech, head selection and other basic components of linguistic content that higher-level annotation schemas assume in common. Once standards are set, any violation should be deliberate rather than accidental, and researchers should justify it. The meeting also aims to set up incentives, in the form of grants for small (e.g., student) projects, because several initial standards-compliant annotation projects could plant the seeds needed for the standards to take root.

Intellectual merit: Establishing a common base for linguistic annotation will: (1) make it easier to use, merge and compare different types of annotation (from different transducers, different manual sets of annotation, etc.); (2) make a more rigorous set of annotation standards possible; and (3) facilitate the use of sophisticated natural-language-informed applications that can draw on annotation created by several different projects simultaneously.

Broader impact: This standardization process will bring about greater cooperation among annotation researchers and, as a result, greatly improve the efficiency of such research. This could significantly improve the state of the art in linguistic processing and, in turn, in the applications (automatic search, translation, etc.) that rely on the automatic linguistic analysis of text.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 0948101
Program Officer: Tatiana D. Korelsky

Project Start
Project End
Budget Start: 2009-09-01
Budget End: 2014-08-31
Support Year
Fiscal Year: 2009
Total Cost: $22,500
Indirect Cost

New York University, New York, NY, United States
