The goal of this project is to develop a sound basis and methodology for the representation of corpora and linguistic information in corpora, as well as for the design of text-handling tools, for use in corpus-based natural language processing (NLP) research. The work is undertaken in collaboration with the Laboratoire Parole et Langage in Aix-en-Provence, France. The project involves (1) analysis of the needs of corpus-based NLP research, both in terms of the kinds and degree of annotation required and the requirements for efficient processing, accessibility, etc.; (2) analysis of the general properties and configuration of corpora, analysis of the relevant structural and logical features of component text types, and the design of encoding mechanisms that can represent all required elements and features while accommodating the requirements determined in (1); and (3) specifications for text software design, coordinated with (2), designed to avoid redundancy and maximize the modifiability, extensibility, and reusability of corpus-handling software. The methods and materials developed in this project will provide a comprehensive framework for the machine representation and manipulation of large corpora for corpus-based NLP research, thus enabling both software and data to be easily shared, used, and reused in the future.

Agency
National Science Foundation (NSF)
Institute
Division of Atmospheric and Geospace Sciences (AGS)
Type
Standard Grant (Standard)
Application #
9509962
Program Officer
Kenneth H. Schatten
Project Start
Project End
Budget Start
1995-05-15
Budget End
1996-04-30
Support Year
Fiscal Year
1995
Total Cost
$5,400
Indirect Cost
Name
Planetary Science Institute
Department
Type
DUNS #
City
Tucson
State
AZ
Country
United States
Zip Code
85719