In recent years, computer systems designed to understand ordinary language have improved rapidly. These advances depend in part on improving the language resources that help systems recognize different senses of individual words and how the parts of a sentence fit together.

People understand words largely by relating them to common situations; in 'She tossed the letter across the table to Jerry', one recognizes that the main idea comes from 'toss', and understands the different roles that 'she', 'the letter', 'across the table', and 'to Jerry' play. For the past fifteen years, the FrameNet team at the International Computer Science Institute has been defining situations, called semantic frames, and labeling the constituents of illustrative sentences with semantic roles. The FrameNet knowledge database contains over 1,100 semantic frames, ranging from getting a job to curing a disease, and almost 200,000 labeled sentences. Other researchers have created software to automatically label sentences in open text based on FrameNet data; these automatic semantic role labeling (ASRL) systems facilitate the automatic recognition of events and their participants in documents ranging from news stories to military reports.
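
The 'toss' example above can be sketched as a small data structure: a frame names a situation and its roles, and an annotation attaches spans of one sentence to those roles. This is only an illustrative sketch, not FrameNet's own data model or API; the class names, the frame name `Tossing_sketch`, and the role names `Agent`, `Theme`, `Path`, and `Goal` are assumptions chosen for the example and are not taken from the FrameNet database.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Frame:
    """A situation type together with the roles its participants can play."""
    name: str
    roles: Tuple[str, ...]


@dataclass
class Annotation:
    """One sentence labeled against a frame: the evoking word plus role spans."""
    sentence: str
    frame: Frame
    target: str  # the word in the sentence that evokes the frame
    labels: Dict[str, str] = field(default_factory=dict)  # role -> text span

    def label(self, role: str, span: str) -> None:
        """Attach a span of the sentence to one of the frame's roles."""
        if role not in self.frame.roles:
            raise ValueError(f"{role!r} is not a role of frame {self.frame.name!r}")
        if span not in self.sentence:
            raise ValueError(f"{span!r} does not occur in the sentence")
        self.labels[role] = span

    def unfilled_roles(self) -> List[str]:
        """Roles of the frame that have no labeled span yet."""
        return [r for r in self.frame.roles if r not in self.labels]


# Hypothetical frame and role names, used only for illustration.
SENTENCE = "She tossed the letter across the table to Jerry"
tossing = Frame("Tossing_sketch", ("Agent", "Theme", "Path", "Goal"))

annotation = Annotation(SENTENCE, tossing, target="tossed")
annotation.label("Agent", "She")
annotation.label("Theme", "the letter")
annotation.label("Path", "across the table")
annotation.label("Goal", "to Jerry")

for role in tossing.roles:
    print(f"{role}: {annotation.labels[role]}")
```

An automatic semantic role labeling system would, roughly speaking, have to produce the `target` and `labels` fields on its own for unseen text; here they are filled in by hand, as a human annotator would.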

This award funds a one-week workshop, September 9-13, 2013, introducing FrameNet to a wider range of industry and academic participants. Speakers include the FrameNet team as well as developers and users of ASRL systems. Topics covered range from implications of FrameNet for protecting privacy to using frames in understanding metaphor. Software developers have a hands-on coding session using the new FrameNet library for NLTK; other participants perform hands-on frame definition and annotation.

Project Report

FrameNet is a computational linguistic research project housed at the International Computer Science Institute (ICSI), in Berkeley, CA. FrameNet is building a highly detailed online dictionary, explaining the many meanings of common English words and documenting how they are used in sentences, based on the theory of Frame Semantics. The FrameNet database contains 200,000 manually annotated examples of usage, providing detailed information about how semantic and grammatical patterns link together for more than 13,000 of the most frequent word senses in English. The semantic components are represented as Semantic Frames that characterize events, relations, states, or entities. Each frame also represents the participants in the given situation; for example, FrameNet has defined frames for visiting, which include roles for the Guest, the Host and the Location, and steps in the process, such as arriving, staying, and leaving. The more than 1,100 frames are linked together by frame-to-frame relations, such as those that tie the Criminal Process scenario to frames for committing a crime, arrest, trial, etc. (shown in attached diagram 1). The FrameNet database is freely available, and thousands of research and development projects around the world use FrameNet data to help analyze events and relations in text as part of NLP systems for commercial and government clients.

This award provided funding for two FrameNet workshops. The first, Sept. 9-14, 2013, was a week-long educational meeting held at ICSI. The main purpose was to explain the structure of the FrameNet lexical database (shown in attached diagram 2) to both current and prospective users. FrameNet staff members gave in-depth presentations on Frame Semantics, how frames are defined, how manual annotation is carried out, experiments with crowdsourcing for FrameNet annotation, and current parallel research on FrameNets in more than ten other languages.

Guest speakers described how they use FrameNet data: Dr. Dipanjan Das of Google described the system for automatic frame semantic role labeling that he built as part of his Ph.D. research at Carnegie Mellon University (CMU). Dr. Nancy C. Chang of Google discussed the representation of events in FrameNet and how that connects with Google research on event categorization and natural language understanding. Tim Hawes of Decisive Analytics Corporation reported on his company's use of FrameNet data for semantic role labeling of text and relational sentiment analysis for military and civilian clients. Dr. Josef Ruppenhofer of Hildesheim University, Germany, talked about his research on deep sentiment analysis based on semantic frames. Dr. Gerald Friedland of ICSI reported on prospects for research using frame semantics to analyze issues surrounding privacy on the Internet. Finally, Dr. Nathan Schneider of CMU announced the development of a Python API for the FrameNet data; one day of the workshop was devoted to a hands-on session in which participants either attempted to define new frames for domains they chose or analyzed existing FrameNet frames using the new software tools.

The second workshop was held June 29, 2014 in Baltimore, MD, just after the annual meeting of the Association for Computational Linguistics. Most of the participants were already users or developers of FrameNet, so the meeting was an opportunity for FrameNet staff to receive feedback about making FrameNet more useful, and ideas from other developers about improving the database. Researchers from CMU discussed two directions of current research: (1) combining the frames in a sentence to produce a connected graph which can represent the meaning of even a complex sentence, and (2) cross-training the semantic role labeling software on parallel sentences in other languages. Staff from Decisive Analytics Corp. outlined the need for better ways of explaining Frame Semantics to their clients, and better interfaces for searching data using semantic frames. The CEO of Northside Software, a computer game company, talked about how its human-computer dialog systems work and about plans to improve them by incorporating FrameNet data, allowing the system to recognize the current situation and plan for the next step in the dialog. There was general agreement on the need to improve communication among users of FrameNet data and build a strong community of users. Some improvements to the FrameNet website were suggested as important steps toward that goal, including new types of interactive and online search over the data.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1346605
Program Officer: Tatiana Korelsky
Budget Start: 2013-08-01
Budget End: 2015-01-31
Fiscal Year: 2013
Total Cost: $30,000

Name: International Computer Science Institute
City: Berkeley
State: CA
Country: United States
Zip Code: 94704