This work represents a two-pronged extension of the lexical database WordNet to create a resource suitable for automated word sense disambiguation, multilingual adoption, and crosslinguistic applications. Mapping WordNet's words and concepts into those of other languages requires disambiguation, identification, and discrimination of related entities. First, we group WordNet's finely distinguished senses of highly polysemous nouns, verbs, and adjectives into underspecified "super"-senses to facilitate discrimination and disambiguation. Second, we add to WordNet's verb entries sentences that illustrate their subcategorization patterns and selectional restrictions, so as to better distinguish senses and facilitate matching the verb lexicons of different languages. The project is significant because it creates a resource that meets present demands for multilingual Natural Language Processing applications. Moreover, the augmentations carry considerable theoretical interest. Systematic underspecification of lexical entries might lead towards a psychologically realistic lexicon. A principled grouping of closely related senses will shed light on the nature of polysemy and on the specific ways in which flexible word meanings yield possible sense extensions in all areas of the lexicon. Subcategorization information and selectional restrictions will reveal the extent to which the semantic relatedness of verbs, expressed in WordNet's relational structure, is correlated with syntactic relatedness.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 9805732
Program Officer: Ephraim P. Glinert
Budget Start: 1998-11-15
Budget End: 2000-10-31
Fiscal Year: 1998
Total Cost: $99,798

Name: Princeton University
City: Princeton
State: NJ
Country: United States
Zip Code: 08540