American Sign Language (ASL) grammar is specified by the manual sign (the hands) and by nonmanual components, which include the face. Our general hypothesis is that nonmanual facial articulations perform significant semantic and syntactic functions by means of a more extensive set of facial expressions than that seen in other communicative systems (e.g., speech and emotion). This proposal will study this general hypothesis systematically through three specific hypotheses. First, we hypothesize (H1) that the facial muscles involved in the production of clause-level grammatical facial expressions in ASL, and/or their intensity of activation, are more extensive than those seen in speech and emotion. Second, we hypothesize (H2) that the temporal structure of these facial configurations is more extensive than that seen in speech and emotion. Finally, we hypothesize (H3) that eliminating these ASL nonmanual markers from the original videos drastically reduces the chances of correctly identifying the clause type of the signed sentence. To test these three hypotheses, we propose a highly innovative approach based on the design of computational tools for the analysis of nonmanuals in signing. In particular, we will pursue the following three specific aims.
In Aim 1, we will build a series of computer algorithms that automatically (i.e., without any human intervention) detect the face and its facial features, as well as the movements of the facial muscles and their intensity of activation. These tools will be integrated into ELAN, standard software used for linguistic analysis, and will then be used to test six specific hypotheses that address H1.
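To make the Aim 1 pipeline concrete, the following is a minimal illustrative sketch of how a facial-muscle activation intensity could be computed from per-frame landmark positions. The intensity measure (a normalized brow-to-eye distance as a proxy for brow-raise activation), the landmark values, and the normalization scheme are all assumptions for illustration, not the project's actual algorithm; a real pipeline would first run automatic face and landmark detection on each video frame.

```python
# Hypothetical sketch: per-frame intensity of a brow raise (a rough proxy
# for AU1/AU2 activation) from already-detected 2D facial landmarks.
# All landmark values and the normalization are illustrative assumptions.

def brow_raise_intensity(brow_y, eye_y, face_height):
    """Normalized brow-to-eye distance; larger values = stronger raise.

    Image y-coordinates grow downward, so a raised brow has a smaller
    brow_y and therefore a larger (eye_y - brow_y) gap.
    """
    if face_height <= 0:
        raise ValueError("face_height must be positive")
    return (eye_y - brow_y) / face_height

def intensity_track(frames):
    """One intensity value per video frame.

    `frames` is a list of dicts with keys 'brow_y', 'eye_y', 'face_height'
    (hypothetical output of an automatic landmark detector).
    """
    return [brow_raise_intensity(f["brow_y"], f["eye_y"], f["face_height"])
            for f in frames]

# Example: a brow raise that strengthens over three frames.
frames = [
    {"brow_y": 90, "eye_y": 100, "face_height": 200},
    {"brow_y": 84, "eye_y": 100, "face_height": 200},
    {"brow_y": 78, "eye_y": 100, "face_height": 200},
]
track = intensity_track(frames)
```

A track like this, computed automatically for every frame, is the kind of time series that could be exported into ELAN tiers alongside the manual annotations.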
In Aim 2, we will define computer vision and machine learning algorithms to identify the temporal structure of ASL facial configurations and examine how it compares to that seen in speech and emotion. We will study six specific hypotheses to address H2; alternative hypotheses are defined for both aims. Finally, in Aim 3, we will define algorithms that automatically modify the original videos of facial expressions in ASL to eliminate the identified nonmanual markers. Native users of ASL will complete behavioral experiments to examine H3 and to test potential alternative hypotheses; comparative analyses with non-signer controls will also be completed. These studies will further validate H1 and H2. We provide evidence of our ability to successfully complete the tasks in each of these aims.
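One simple way to characterize the temporal structure mentioned in Aim 2 is to segment a per-frame activation track into onset, apex, and offset phases. The sketch below does this by thresholding against the peak intensity; the threshold value and the single-apex phase definitions are assumptions for illustration, not the project's actual temporal model.

```python
# Hypothetical sketch: label each frame of an intensity track as
# 'onset', 'apex', or 'offset'. The apex_ratio threshold and the
# assumption of a single apex interval are illustrative choices.

def segment_phases(track, apex_ratio=0.8):
    """Label frames by phase relative to the peak intensity.

    Frames at or above apex_ratio * max(track) form the apex;
    frames before it are onset, frames after it are offset.
    """
    if not track:
        return []
    threshold = apex_ratio * max(track)
    apex_idx = [i for i, v in enumerate(track) if v >= threshold]
    first, last = apex_idx[0], apex_idx[-1]
    labels = []
    for i in range(len(track)):
        if i < first:
            labels.append("onset")
        elif i <= last:
            labels.append("apex")
        else:
            labels.append("offset")
    return labels

# Example: intensity rises, holds near its peak, then falls.
track = [0.1, 0.3, 0.6, 0.9, 1.0, 0.9, 0.5, 0.2]
labels = segment_phases(track)
```

Phase durations and their alignment to the manual sign stream are the kinds of temporal features that could then be compared across ASL, speech, and emotional expression.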
These aims address a critical need: at present, the study of nonmanuals must be carried out by hand, and drawing conclusive results requires the analysis of thousands of videos. The proposed computational approach yields at least a 50-fold reduction in time compared to manual methods.

Public Health Relevance

Deafness limits access to information, with consequent effects on academic achievement, personal integration, and lifelong financial prospects, and it also inhibits valuable contributions by Deaf people to the hearing world. The public benefits of our research include: (1) progress toward a practical and useful device to enhance communication between Deaf and hearing people in a variety of settings; and (2) the removal of a barrier that prevents Deaf individuals from achieving their full potential. An understanding of the nonmanuals will also change how ASL is taught, improving the training of teachers of the Deaf, sign language interpreters and instructors, and, crucially, parents of deaf children.

National Institutes of Health (NIH)
National Institute on Deafness and Other Communication Disorders (NIDCD)
Research Project (R01)
Study Section: Language and Communication Study Section (LCOM)
Program Officer: Cooper, Judith
Ohio State University
Engineering (All Types)
Biomed Engr/Col Engr/Engr Sta
United States
Benitez-Quiroz, Fabian; Srinivasan, Ramprakash; Martinez, Aleix M (2018) Discriminant Functional Learning of Color Features for the Recognition of Facial Action Units and their Intensities. IEEE Trans Pattern Anal Mach Intell
Pumarola, Albert; Agudo, Antonio; Martinez, Aleix M et al. (2018) GANimation: Anatomically-aware Facial Animation from a Single Image. Comput Vis ECCV 11214:835-851
Benitez-Quiroz, Carlos F; Srinivasan, Ramprakash; Martinez, Aleix M (2018) Facial color is an efficient mechanism to visually transmit emotion. Proc Natl Acad Sci U S A 115:3581-3586
Zhao, Ruiqi; Wang, Yan; Martinez, Aleix M (2018) A Simple, Fast and Highly-Accurate Algorithm to Recover 3D Shape from 2D Landmarks on a Single Image. IEEE Trans Pattern Anal Mach Intell 40:3059-3066
Martinez, Aleix M (2017) Visual perception of facial expressions of emotion. Curr Opin Psychol 17:27-33
Martinez, Aleix M (2017) Computational Models of Face Perception. Curr Dir Psychol Sci 26:263-269
Zhao, Ruiqi; Martinez, Aleix M (2016) Labeled Graph Kernel for Behavior Analysis. IEEE Trans Pattern Anal Mach Intell 38:1640-1650
Benitez-Quiroz, C Fabian; Wilbur, Ronnie B; Martinez, Aleix M (2016) The not face: A grammaticalization of facial expressions of emotion. Cognition 150:77-84
Srinivasan, Ramprakash; Golomb, Julie D; Martinez, Aleix M (2016) A Neural Basis of Facial Action Recognition in Humans. J Neurosci 36:4434-4442