The "Tools for Lexicon Building" project is preparing a lexical database for English called "FrameNet", built on abundant corpus evidence and formulated on the principles of frame semantics. It will be freely available to researchers in several formats, including one that can serve as an accessory to WordNet. The primary deliverable is a highly relational "starter" lexicon containing 5,000 entries covering major semantic domains. Each database entry is associated with the semantic frame in which it participates and is annotated with respect to how it combines with the elements of that frame, both morphosyntactically and semantically. Entries will also list examples taken from the corpus that show the range of the word's use, together with relative frequency data. Another significant product of the project is a set of high-performance computational tools for corpus research, annotation, and analysis, some of which are being developed in collaboration with colleagues in Europe; these tools will be made available to the research community. The resulting facility, which incorporates both linguistic-semantic and lexicographic techniques, will be valuable to human users of both printed and online works and also to researchers exploring natural language processing, speech recognition, and the complex problems of language understanding.
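The entry structure described above (a word tied to a frame, annotated examples, and relative frequency data) can be pictured with a small sketch. This is only an illustration under stated assumptions, not the project's actual schema or file format: the class names `LexicalEntry` and `AnnotatedExample`, the frame name "Commerce", the frame element names, the syntactic labels, and the three sentences are all hypothetical and are not taken from the corpus.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AnnotatedExample:
    """One sentence with its frame elements labelled (hypothetical layout)."""
    text: str
    # frame element name -> (phrase realising it, syntactic label of the phrase)
    elements: dict[str, tuple[str, str]]


@dataclass
class LexicalEntry:
    """A word linked to the semantic frame it participates in (hypothetical)."""
    lemma: str
    frame: str
    frame_elements: list[str]
    examples: list[AnnotatedExample] = field(default_factory=list)

    def pattern_frequencies(self) -> dict[tuple[tuple[str, str], ...], float]:
        """Relative frequency of each (frame element, syntactic label) pattern."""
        counts = Counter(
            tuple(sorted((fe, label) for fe, (_, label) in ex.elements.items()))
            for ex in self.examples
        )
        total = sum(counts.values())
        return {p: n / total for p, n in counts.items()} if total else {}


# Invented sentences for illustration only; they are not corpus data.
entry = LexicalEntry(
    lemma="buy",
    frame="Commerce",
    frame_elements=["Buyer", "Goods", "Seller", "Money"],
    examples=[
        AnnotatedExample(
            "Kim bought a car from Lee",
            {"Buyer": ("Kim", "NP"), "Goods": ("a car", "NP"),
             "Seller": ("from Lee", "PP")},
        ),
        AnnotatedExample(
            "Pat bought a book",
            {"Buyer": ("Pat", "NP"), "Goods": ("a book", "NP")},
        ),
        AnnotatedExample(
            "Sam bought a bike",
            {"Buyer": ("Sam", "NP"), "Goods": ("a bike", "NP")},
        ),
    ],
)

freqs = entry.pattern_frequencies()
for pattern, share in freqs.items():
    print(pattern, round(share, 2))
```

Keeping the semantic role (the frame element) and the syntactic label side by side in each example is one way to record both the semantic and the morphosyntactic side of an annotation, and the frequency table is then derived from the examples rather than stored separately.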

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9618838
Program Officer: Ephraim P. Glinert
Project Start:
Project End:
Budget Start: 1997-03-01
Budget End: 2000-02-29
Support Year:
Fiscal Year: 1996
Total Cost: $757,039
Indirect Cost:
Name: International Computer Science Institute
Department:
Type:
DUNS #:
City: Berkeley
State: CA
Country: United States
Zip Code: 94704