CARD: Corpus Analysis Resources for Discourse

McKeown, Kathleen; Passonneau, Rebecca; Allen, James

Abstract

This is a collaborative effort among three universities (Columbia, Rochester, and Pittsburgh) to construct, evaluate, and disseminate a package of Corpus Analysis Resources for Discourse (CARD). The goal is to provide the means for large-scale, robust analysis of language use, both within and across distinct types of discourse corpora. CARD has three components: a Discourse Annotation Language (DAL) for encoding information about language use directly within discourse corpora; reliability measures that quantify the degree of variability in DAL annotations; and a library of DAL-annotated corpora that vary in modality, number of participants, domain, and communicative task. DAL follows the Text Encoding Initiative (TEI) guidelines and is implemented in Standard Generalized Markup Language (SGML) so that common authoring and editing utilities can be used with it. DAL is a modular language with five layers of linguistic representation: morpho-syntactic, prosodic, anaphoric, lexical, and segmental.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9528998
Program Officer: Ephraim P. Glinert

Budget Start: 1996-05-15
Budget End: 2000-04-30
Fiscal Year: 1995
Total Cost: $760,821

Institution

Columbia University, New York, NY, United States