The workshop gathers developers and users of lexical resources, corpus and computational linguists, and researchers in natural language processing to discuss a targeted restructuring of the adjectives in the lexical database WordNet. Specific proposals are presented for replacing a subset of the current clustering of adjectives around antonyms with ordered scales reflecting the relative intensity of dimensional adjectives, such as "big", "huge" and "gigantic". They are accompanied by preliminary work demonstrating the feasibility of corpus-based construction of scales by means of lexical-semantic patterns and the potential benefits of such scales for NLP. Discussion topics include (1) the principal benefits of encoding scalar properties for applications including word sense disambiguation, textual entailment and language pedagogy; (2) suitable corpora for extracting data for scale construction; (3) limitations of the recently developed AdjScales method and alternative or complementary methods for extracting scalar properties; and (4) modeling of scalar adjectives in WordNet. Participants evaluate the proposed restructuring of adjectives for its feasibility, value and relevance to their own work and its potential for future research and applications. A report including the presentations, discussions and recommendations of the group will be prepared and freely disseminated via the WordNet website.

The directions for targeted future developments of the widely used WordNet database as spelled out and agreed upon by representatives from a broad expert community assure significant consequences for research and applications in language technology and pedagogy. For a post-doctoral fellow and a graduate student the workshop provides a unique opportunity to interact with experts in the field.

Project Report

A workshop was organized to gather representative members of the scientific community with the aim of discussing specific enhancements to WordNet, a large electronic lexical database of English that serves as a widely used tool for Natural Language Processing. WordNet supports Word Sense Disambiguation largely through measures of the distance among words within its semantic network structure. Despite its huge popularity, WordNet’s potential is not fully exploited; in particular, its 18,000 adjectives remain essentially untapped. Their current organization into unconnected clusters is not conducive to measuring semantic similarity via edge counting. Moreover, the "similar" and "see also" pointers among adjectives are too vague and heterogeneous to help determine the nature of the semantic similarity among interconnected adjectives. The workshop explored the reorganization of a core subset of frequent English adjectives in WordNet and its potential benefits for the computational linguistics and NLP communities.

The targeted adjectives have scalar properties, i.e., they express different degrees of a shared attribute. For example, "cool", "chilly" and "frigid" express different degrees of cold and can be placed on a scale reflecting their relative intensity. Sheinman and Tokunaga (2009) developed a method, AdjScales, that extracts adjectives expressing different intensities of a shared underlying attribute by means of lexical-semantic patterns. Thus, a "strong" pattern like "X if not Y" has the stronger adjective on the right ("good if not great"), while a "weak" pattern like "not X but Y enough" ("not great but good enough") shows the stronger adjective to the left of the weaker one. The scalar relations are downward entailing, in that the stronger adjective always implies the weaker one, but the converse is not necessarily true. The AdjScales method has been tested on a small scale, and the plan is to apply it to many more adjectives.
A representation of scales in WordNet was discussed that would reflect essential and subtle semantic properties of scalar adjectives and encode their relative placement on a common scale. The workshop participants discussed the methods, advantages, limits, and benefits of representing scalar adjectives in WordNet for Word Sense Disambiguation, Recognizing Textual Entailment and language pedagogy. For a post-doctoral fellow and a graduate student, the workshop provided a unique opportunity to interact with experts in the field. The workshop discussions laid the groundwork for the future exploration of two additional adjective classes that block or allow inferences depending on their syntactic context.
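
The pattern-based ordering described in the report can be illustrated with a minimal sketch. It is not the AdjScales implementation: the toy corpus, the two regular expressions, the function names `collect_pairs` and `order_scale`, and the wins-minus-losses ranking are all assumptions made here for illustration. The sketch only shows how matches of an "X if not Y" pattern and a "not X but Y enough" pattern could yield weaker/stronger pairs that are then arranged on a scale.

```python
import re
from collections import Counter

# Hypothetical toy corpus; a real run would use a large corpus or web counts.
CORPUS = [
    "The meal was good if not great.",
    "The result was not great but good enough.",
    "The water felt cool if not chilly.",
    "It was chilly if not frigid outside.",
]

# "X if not Y": the stronger adjective is on the right.
STRONG = re.compile(r"\b(\w+) if not (\w+)\b")
# "not X but Y enough": the stronger adjective is on the left.
WEAK = re.compile(r"\bnot (\w+) but (\w+) enough\b")


def collect_pairs(sentences):
    """Count (weaker, stronger) adjective pairs attested by either pattern."""
    pairs = Counter()
    for sentence in sentences:
        text = sentence.lower()
        for weaker, stronger in STRONG.findall(text):
            pairs[(weaker, stronger)] += 1
        for stronger, weaker in WEAK.findall(text):
            pairs[(weaker, stronger)] += 1
    return pairs


def order_scale(pairs, words):
    """Order words from weakest to strongest.

    Simplifying assumption: a word's score is the number of attested pairs in
    which it is the stronger member minus the number in which it is the weaker.
    """
    members = set(words)
    score = {w: 0 for w in words}
    for (weaker, stronger), count in pairs.items():
        if weaker in members and stronger in members:
            score[stronger] += count
            score[weaker] -= count
    return sorted(words, key=lambda w: score[w])


pairs = collect_pairs(CORPUS)
print(order_scale(pairs, ["frigid", "cool", "chilly"]))
```

On this toy corpus both patterns support the pair (good, great), and the three temperature adjectives are ordered from weakest to strongest. A real system would also need part-of-speech filtering so that only adjectives fill the X and Y slots.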

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1139844
Program Officer: Tatiana Korelsky
Budget Start: 2011-09-01
Budget End: 2012-08-31
Fiscal Year: 2011
Total Cost: $20,000

Name: Princeton University
City: Princeton
State: NJ
Country: United States
Zip Code: 08544