Perhaps the most powerful aspect of the human brain, and also the least understood, is its ability to represent and understand language. The present research advances the field by developing mathematical models that try to capture an important aspect of that ability: how the brain represents individual words and how it combines them into phrases and sentences. Because it extracts measures of meaning from brain activation patterns, this research is of potential relevance to people with neurological language deficits who can represent meaning but have problems expressing it. Beyond cognitive neuroscience, this work may also help improve computers' ability to process natural language, by building and testing more powerful computational models of meaning than are currently available. By bridging Cognitive Neuroscience, Data Science and Linguistics, this work also enables new interdisciplinary training of students.

To carry out this work, experimental and theoretical approaches are combined: functional magnetic resonance imaging (fMRI), behavioural testing and computational modeling. These approaches are brought together by using the shared framework of representing word meanings in what are known as "semantic spaces". In a semantic space, each word is represented as a vector, i.e. as an ordered list of numbers, where each such number quantifies a specific feature of the word's overall meaning. This sort of representation has structure: words with more similar meanings are closer together. The research uses models of semantic space to decode fMRI data, by finding mappings between the structure of the semantic model and the similarity-structure of distributed neural activation patterns. In particular, the work investigates whether greater understanding of neural representational structure can be achieved by combining two seemingly distinct types of semantic model: those derived from co-occurrence frequencies of words in large bodies of text, and those obtained from people's behaviourally measured ratings of features of a word's meaning. The research addresses this question not only for neural representations of isolated words, but also for adjective-noun phrases and for entire sentences. It also seeks to isolate purely meaning-related aspects of the neural signal by distinguishing between words which often co-occur with each other but which have distinct meanings, such as 'cup' and 'coffee', as opposed to words that have genuinely similar meanings, such as 'cup' and 'mug'. The work takes an additional approach to isolating meaning from lower-level features, by performing neural decoding across speakers of different languages, e.g. Chinese and English, which can represent the same meanings as each other but which differ greatly in their sound patterns and written visual appearance. 
Collectively, these lines of work enable progress on the fundamental problem of how the human brain understands language, by bringing together computation, psychology and neuroscience in novel ways.
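
The semantic-space idea described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the feature values are invented toy numbers, the word "dog" is added only to give the example a fourth item, the "neural" patterns are simulated with random voxel weights, and comparing similarity matrices by Pearson correlation is one common way to relate a model's structure to neural similarity structure, not necessarily the method used in this project. The helper names (similarity_matrix, combine_models, structure_agreement, simulate_voxel_patterns) are hypothetical.

```python
import numpy as np

WORDS = ["cup", "mug", "coffee", "dog"]

# Invented toy values, for illustration only (not real model output).
# Stand-in for a text-derived model: one row of co-occurrence features per word.
TEXT_MODEL = np.array([
    [0.9, 0.8, 0.1],  # cup
    [0.8, 0.9, 0.2],  # mug
    [0.7, 0.2, 0.9],  # coffee
    [0.1, 0.1, 0.2],  # dog
])
# Stand-in for a behavioural-ratings model: two rated features per word.
RATING_MODEL = np.array([
    [0.9, 0.1],  # cup
    [0.9, 0.1],  # mug
    [0.3, 0.0],  # coffee
    [0.2, 1.0],  # dog
])


def similarity_matrix(vectors):
    """Cosine similarity between every pair of row vectors."""
    unit = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    return unit @ unit.T


def combine_models(a, b):
    """Concatenate two models after scaling each word's vector to unit length."""
    unit_a = a / np.linalg.norm(a, axis=1, keepdims=True)
    unit_b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return np.hstack([unit_a, unit_b])


def structure_agreement(sim_a, sim_b):
    """Pearson correlation between the off-diagonal entries of two similarity matrices."""
    rows, cols = np.triu_indices_from(sim_a, k=1)
    return float(np.corrcoef(sim_a[rows, cols], sim_b[rows, cols])[0, 1])


def simulate_voxel_patterns(vectors, n_voxels=50, noise=0.1, seed=0):
    """Simulated activation patterns: a random linear map of the vectors plus noise."""
    rng = np.random.default_rng(seed)
    weights = rng.normal(size=(vectors.shape[1], n_voxels))
    return vectors @ weights + noise * rng.normal(size=(vectors.shape[0], n_voxels))


combined = combine_models(TEXT_MODEL, RATING_MODEL)
text_sim = similarity_matrix(TEXT_MODEL)
combined_sim = similarity_matrix(combined)
neural_sim = similarity_matrix(simulate_voxel_patterns(combined))

cup, mug, coffee = (WORDS.index(w) for w in ("cup", "mug", "coffee"))
print("cup~mug:", round(float(text_sim[cup, mug]), 3))
print("cup~coffee:", round(float(text_sim[cup, coffee]), 3))
print("text model vs simulated patterns:", structure_agreement(text_sim, neural_sim))
print("combined model vs simulated patterns:", structure_agreement(combined_sim, neural_sim))
```

In these toy vectors 'cup' lies closer to 'mug' than to 'coffee', mirroring the distinction the abstract draws between genuinely similar meanings and mere co-occurrence. The two agreement scores show how a text-only model and a combined model could be compared against the same activation patterns; with real fMRI data the simulated patterns would be replaced by measured ones.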

Agency: National Science Foundation (NSF)
Institute: Division of Behavioral and Cognitive Sciences (BCS)
Application #: 1652127
Program Officer: Jonathan Fritz
Project Start:
Project End:
Budget Start: 2017-01-15
Budget End: 2021-12-31
Support Year:
Fiscal Year: 2016
Total Cost: $407,895
Indirect Cost:
Name: University of Rochester
Department:
Type:
DUNS #:
City: Rochester
State: NY
Country: United States
Zip Code: 14627