American Sign Language (ASL) animations have the potential to make information accessible to many deaf adults in the United States who have limited English literacy. In this research, a collaboration across three institutions, the PIs' goal is to gain a better understanding of ASL linguistics through computational techniques while advancing the state of the art in generating ASL animations for accessibility applications for people who are deaf. To these ends, the PIs will develop linguistically based models of two aspects of ASL production: the movements required for head gestures and facial expressions that carry essential grammatical information and frequently extend over domains larger than a single sign, and the timing and coordination of the manual and non-manual elements of ASL signing. Preliminary work has shown that these issues significantly affect how well signers understand ASL animations and that these aspects of current ASL animation technologies require improvement.

How should the face of a human or animated character be articulated to perform, with accuracy, the linguistically meaningful facial expressions that are part of ASL grammar? How should the onsets, offsets, and transitions of these movements be produced? How should facial expressions and hand movements be temporally coordinated so that the ASL production is as grammatically correct and understandable as possible? To answer open questions such as these, the PIs' novel approach will apply computer vision techniques to linguistically annotated video data collected from human signers in order to produce models for use in animation production. The PIs will expand their existing annotated ASL video corpora through new data collection and annotation, and will analyze these data to study the use, timing, and synchronization of the manual and non-manual components of ASL production. The annotated videos will be used to train high-quality computer vision models that recognize linguistically significant facial expressions and timing subtleties. Parameters of these computer vision models will be used to hypothesize computational models of ASL timing and facial movements, which will be incorporated into ASL animation generation software and evaluated by native signers. The models will be iteratively refined through cycles of user-based studies and incorporated into ASL animation technologies to more accurately mimic human signing.

Project outcomes will include high-quality models of the movement of virtual human characters for animations of ASL performance. The analysis of ASL video corpora will produce new linguistic insights into micro-facial expressions and the temporal coordination of the face and hands in ASL production, while advances in the analysis of ASL prosody will contribute to an understanding of the fundamental commonalities and modality-specific differences between signed and spoken languages that is essential to a full understanding of the human language faculty. The new modeling approaches and recognition techniques will advance the field of computer vision by improving the identification and tracking of the human face and body in video during the rapid and complex movements of ASL (and other forms of human movement).
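As a rough illustration of the pipeline described above (annotated signing video, then computer-vision tracking, then parametric models of facial movement and timing), the following minimal sketch uses the off-the-shelf MediaPipe FaceMesh tracker as a stand-in for the project's purpose-built recognition models. It is not part of the project's software: it simply extracts a per-frame time series of facial landmarks from a signing video, the kind of raw signal from which trajectories of non-manual signals such as brow raises, and their onsets and offsets, could later be estimated. The file name asl_clip.mp4 and the library choice are illustrative assumptions.

```python
# Illustrative sketch only: an off-the-shelf tracker standing in for the
# project's computer-vision models. Produces a per-frame landmark time series
# from a signing video. Requires: pip install opencv-python mediapipe
import cv2
import mediapipe as mp

def landmark_time_series(video_path):
    """Return a list of (frame_index, landmarks) pairs, where landmarks is a
    list of (x, y, z) coordinates normalized to the frame dimensions."""
    series = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.face_mesh.FaceMesh(
            static_image_mode=False,        # treat input as a video stream
            max_num_faces=1,                # one signer per clip
            min_detection_confidence=0.5) as face_mesh:
        frame_idx = 0
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            # MediaPipe expects RGB input; OpenCV decodes frames as BGR.
            result = face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.multi_face_landmarks:
                pts = [(lm.x, lm.y, lm.z)
                       for lm in result.multi_face_landmarks[0].landmark]
                series.append((frame_idx, pts))
            frame_idx += 1
    cap.release()
    return series

if __name__ == "__main__":
    # "asl_clip.mp4" is a hypothetical file name, not project data.
    ts = landmark_time_series("asl_clip.mp4")
    print(f"tracked {len(ts)} frames")
```

From such a time series one could, for example, fit onset and offset times of an eyebrow-raising movement and compare them against the manual sign boundaries in the linguistic annotation; the project's own models would of course go well beyond this generic tracking step.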

Broader Impacts: This research will lead to significant improvements in technology for generating linguistically accurate ASL animations, making information, applications, websites, and services more accessible to the large number of deaf individuals with relatively low English literacy. Advances in techniques for recognizing ASL in videos of human signers will have broader applicability in human-computer interaction, in the recognition and animation of facial expressions, and in computer vision generally. The corpora created in this project will enable students and researchers in both linguistics and computer science (including those without access to the technological and human resources needed to carry out their own data collection from native signers and time-intensive linguistic annotation) to engage in research on ASL. The techniques to be developed will also enable partial automation of the time-consuming creation of annotated ASL video corpora. As in the PIs' earlier work, the proposed research will create opportunities for people who are deaf, and for members of other underrepresented groups, to participate in scientific research.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1064965
Program Officer: Ephraim Glinert
Budget Start: 2011-07-01
Budget End: 2016-06-30
Fiscal Year: 2010
Total Cost: $469,996
Name: Rutgers University
City: Piscataway
State: NJ
Country: United States
Zip Code: 08854