Resources for Word-Sense Identification

Miller, George

Abstract

This is the first-year funding of a two-year continuing award. The objective is to provide textual corpora for developing and evaluating automatic methods of identifying the intended senses of polysemous English words, and to make these resources generally available. One thousand English words (nouns, verbs, adjectives, and adverbs) that have multiple meanings and occur frequently in printed materials have been selected. Corpora of written English will be searched for occurrences of these words, and each occurrence will then be classified by hand according to the word's meaning in its context. The result will be a set of files for each word, one file for each attested meaning; the sizes of these files will indicate the relative frequency of the different senses of each word. Such files can be used to develop automatic methods of word-sense identification, and they will make it possible to standardize the evaluation of automatic word-sense identification systems and to measure progress in this branch of natural language processing. Better methods of automatic word-sense identification will facilitate language processing in applications such as information retrieval, machine translation, and computer-assisted instruction.
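The resource described above (one file per attested sense of each word, with file sizes reflecting relative sense frequency) can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the example words, sense labels, and sentences are invented, the in-memory dictionaries stand in for the per-sense files, and the most-frequent-sense baseline is a common reference point in word-sense evaluation rather than a method named in the award. The baseline is scored against the same hand-tagged occurrences it was counted from, purely to show the mechanics.

```python
from collections import defaultdict

# Hypothetical hand-tagged occurrences: (word, sense label, context sentence).
# Words, sense labels, and sentences are invented for illustration only.
TAGGED = [
    ("bank", "financial_institution", "She deposited the check at the bank."),
    ("bank", "financial_institution", "The bank raised its interest rates."),
    ("bank", "financial_institution", "He works at a bank downtown."),
    ("bank", "river_edge", "They picnicked on the bank of the river."),
    ("line", "queue", "We waited in line for an hour."),
    ("line", "queue", "The line stretched around the block."),
    ("line", "text_row", "Read the first line of the poem."),
]


def group_by_sense(tagged):
    """Return {word: {sense: [contexts]}}, standing in for one file per sense."""
    files = defaultdict(lambda: defaultdict(list))
    for word, sense, context in tagged:
        files[word][sense].append(context)
    return files


def sense_frequencies(files, word):
    """Relative frequency of each sense of `word`, taken from each file's size."""
    sizes = {sense: len(contexts) for sense, contexts in files[word].items()}
    total = sum(sizes.values())
    return {sense: size / total for sense, size in sizes.items()}


def accuracy(predict, tagged):
    """Fraction of tagged occurrences where predict(word, context) is correct."""
    correct = sum(predict(word, context) == sense for word, sense, context in tagged)
    return correct / len(tagged)


files = group_by_sense(TAGGED)
bank_freqs = sense_frequencies(files, "bank")


def most_frequent_sense(word, _context):
    """Baseline: ignore the context and always guess the commonest sense."""
    freqs = sense_frequencies(files, word)
    return max(freqs, key=freqs.get)


baseline_accuracy = accuracy(most_frequent_sense, TAGGED)

for sense, share in bank_freqs.items():
    print(f"bank / {sense}: {share:.2f}")
print(f"most-frequent-sense baseline accuracy: {baseline_accuracy:.2f}")
```

Because every system would be scored against the same hand-classified occurrences, a function like `accuracy` gives a shared yardstick: any automatic method that maps a word and its context to a sense label can be passed in place of `most_frequent_sense`.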

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 9528983
Program Officer: Gary W Strong

Budget Start: 1995-12-01
Budget End: 1998-03-30
Fiscal Year: 1995
Total Cost: $369,486

Institution: Princeton University, Princeton, NJ, United States