Most words have more than one meaning. The standard computational model of word meaning represents it through lists of dictionary senses. However, choosing the right dictionary sense is a highly difficult task for humans and machines alike. This CAREER project pursues the hypothesis, grounded in current models of human concept representation, that word meaning is better described through a graded notion of similarity than through dictionary senses. The hypothesis is tested empirically through a novel meaning annotation framework and computationally through a vector space model of word meaning. The model uses vector characterizations of typical arguments to compute the meaning of an individual occurrence compositionally from the words in its syntactic context. For evaluation, the project focuses on the ability to draw appropriate inferences, in both shallow and deep frameworks, from similarity-based meaning representations. The research effort is complemented by an educational component that supports undergraduate research, stressing hands-on data exploration and interdisciplinary work.
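The compositional idea can be illustrated with a minimal sketch. The vectors below are hypothetical toy values, and element-wise multiplication is just one simple composition scheme from the literature, not necessarily the model developed in this project: a target word's vector is modulated by the vectors of words in its syntactic context, so that the same word receives different, graded representations in different contexts.

```python
import numpy as np

# Hypothetical co-occurrence vectors in a toy 4-dimensional space.
vectors = {
    "ball":  np.array([0.9, 0.1, 0.7, 0.0]),
    "catch": np.array([0.8, 0.2, 0.1, 0.6]),
    "throw": np.array([0.7, 0.1, 0.2, 0.5]),
    "dance": np.array([0.1, 0.9, 0.0, 0.2]),
    "party": np.array([0.2, 0.8, 0.1, 0.1]),
}

def contextualize(target, context_words):
    """Compose a target vector with its syntactic context words
    by element-wise multiplication (one simple composition scheme)."""
    v = vectors[target].copy()
    for w in context_words:
        v = v * vectors[w]
    return v

def cosine(a, b):
    """Graded similarity between two meaning vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# The same word, "ball", in two different syntactic contexts:
ball_throw = contextualize("ball", ["throw"])   # ball as object of "throw"
ball_dance = contextualize("ball", ["dance"])   # ball as in "dance at a ball"

# The contextualized vectors drift toward different neighbors,
# giving a graded, context-dependent notion of meaning.
print(cosine(ball_throw, vectors["catch"]))
print(cosine(ball_dance, vectors["party"]))
```

Rather than forcing a choice between discrete dictionary senses of "ball", the composed vectors place each occurrence at a graded distance from other words, which is the kind of similarity-based representation the project evaluates.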
The characterization of word meaning is a central issue in lexical semantics and in computational linguistics as a whole. This CAREER project will yield a broadly applicable paradigm that describes word meaning without recourse to dictionary senses. It aims both to provide a more cognitively adequate model and to benefit language technology applications, in particular information retrieval, which already relies heavily on vector space models.