Facial expression is central to human experience. Its efficient and valid measurement is a challenge that automated facial image analysis seeks to address. Currently, few publicly available, annotated databases exist, and those that do are limited to 2D static images or video of posed facial behavior. Further development is stymied by the lack of adequate training data. Because posed and un-posed (i.e., "spontaneous") facial expressions differ along several dimensions, including complexity, well-annotated video of un-posed facial behavior is needed. Moreover, because the face is a three-dimensional deformable object, 2D video is insufficient; a 3D video archive is needed.

This project develops a 3D video corpus of spontaneous facial and vocal expression in a diverse group of young adults. Well-validated emotion inductions elicit expressions of emotion and paralinguistic communication. Sequence-level ground truth is obtained via participant self-report. Frame-level ground truth is obtained via facial action unit coding using the Facial Action Coding System (FACS). The project promotes the exploration of 3D spatiotemporal features in subtle facial expression, better understanding of the relation between pose and motion dynamics in facial action units, and deeper understanding of naturally occurring facial action.
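As a minimal sketch of how the two levels of ground truth described above might be organized, the following Python fragment pairs sequence-level self-report ratings with frame-level FACS action unit (AU) codes. The field names, task label, and intensity values are illustrative assumptions, not the released database schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class FrameAnnotation:
    """Frame-level ground truth: FACS action units coded for one video frame."""
    frame_index: int
    # AU number -> coded intensity (e.g., {12: 3} for AU12 at moderate intensity).
    action_units: Dict[int, int] = field(default_factory=dict)

@dataclass
class Sequence:
    """One emotion-induction trial for one participant."""
    participant_id: str
    task: str                       # induction task label (hypothetical)
    self_report: Dict[str, int]     # sequence-level ground truth from participant ratings
    frames: List[FrameAnnotation] = field(default_factory=list)

# Example: a one-frame sequence in which AU6 and AU12 (a smile) were coded.
seq = Sequence(
    participant_id="F001",
    task="amusement",
    self_report={"amusement": 4},
    frames=[FrameAnnotation(frame_index=0, action_units={6: 2, 12: 3})],
)
print(seq.frames[0].action_units)
```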

The project promotes research on next-generation affective computing with applications in security, law enforcement, biomedicine, behavioral science, entertainment, and education. The multimodal 3D video database and its metadata are made available to the research community for new algorithm development, assessment, comparison, and evaluation.

Project Report

EAGER: Spontaneous 4D-Facial Expression Corpus for Automated Facial Image Analysis
Lijun Yin, State University of New York at Binghamton
Jeffrey F. Cohn, University of Pittsburgh
NSF Directorate/Division: CISE / Division of Information & Intelligent Systems
Program Officer: Dr. Jie Yang
NSF Award Numbers: IIS-1051103 (Binghamton) and IIS-1051169 (Pittsburgh)

Project Outcomes Report:

Facial expression is central to human experience. Its efficient and valid measurement is a challenge that automated facial image analysis seeks to address. Most publicly available databases are limited to 2D static images or video of posed facial behavior. Further development is stymied by the lack of adequate annotated training data. Because posed and un-posed (i.e., "spontaneous") facial expressions differ along several dimensions, including complexity and timing, well-annotated video of un-posed facial behavior is needed. Moreover, because the face is a three-dimensional deformable object, 2D video is insufficient; a 3D video archive is needed.

Thanks to support from NSF grants IIS-1051103 (PI: Dr. Lijun Yin of Binghamton University) and IIS-1051169 (PI: Dr. Jeff Cohn of University of Pittsburgh), a collaborative, inter-institutional research team from Binghamton University and the University of Pittsburgh has developed a new 4D video (3D + time) database of spontaneous facial expression (Zhang, Yin, Cohn, Canavan, Reale, Horowitz, & Liu, 2013). Participants were 23 women and 18 men from diverse ancestries, including Asian, African-American, Hispanic/Latino, and Euro-American. Well-validated emotion inductions were used to elicit expressions of emotion and paralinguistic communication. The database includes multiple types of metadata to maximize its information value and usefulness. Sequence-level ground truth was obtained via participant self-report. Frame-level annotation of facial actions was obtained using manual coding with the Facial Action Coding System (FACS) (Ekman, Friesen, & Hager, 2002). Sixty-six fiducial facial landmarks were hand labeled in approximately 5% of video frames. The landmarks were then used to train person-specific active appearance models (AAM) (Baker, Gross, & Matthews, 2004). Facial features were tracked in both the 2D and 3D domains using both person-specific (AAM) and generic (constrained local model, or CLM) (Lucey, Wang, Cox, Sridharan, & Cohn, 2009) approaches.

To provide benchmarks for automatic facial action unit detection, the project team developed a unique space-time feature, referred to as the Nebula feature, to represent facial actions and detect expressions. Unlike traditional methods that use two-dimensional images or posed 3D facial models, the Nebula feature (Reale, Zhang, & Yin, 2013) uses high-resolution 3D motion models to detect subtle shape and appearance deformations on a 3D surface. This method can recognize facial expressions not only from individual models but also from their dynamic actions over time.

The work promotes the exploration of 3D spatiotemporal features in subtle facial expression, better understanding of the relation between pose and motion dynamics in facial action units, and deeper understanding of naturally occurring facial action. This 4D spontaneous facial expression database is the first of its kind for spontaneous facial expression research. It will be released to the research community in April 2013 at the IEEE International Conference on Automatic Face and Gesture Recognition for use in algorithm development and testing.
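To make the sparse-labeling step described above concrete (hand labeling roughly 5% of frames to bootstrap person-specific tracking), here is a minimal Python sketch that selects an evenly spaced subset of frames for manual annotation. The function name, the 600-frame example, and the even-spacing strategy are assumptions for illustration only, not the project's actual tooling or selection procedure.

```python
import numpy as np

def select_frames_for_labeling(num_frames: int, fraction: float = 0.05) -> np.ndarray:
    """Pick an evenly spaced subset (~5%) of frames for manual landmark labeling."""
    num_labeled = max(1, int(round(num_frames * fraction)))
    return np.linspace(0, num_frames - 1, num_labeled).astype(int)

# Example: a hypothetical 600-frame sequence yields 30 frames, each of which would
# receive 66 hand-labeled (x, y) landmark points to train a person-specific model.
frames_to_label = select_frames_for_labeling(600)
landmarks = np.zeros((len(frames_to_label), 66, 2))   # placeholder landmark array
print(frames_to_label[:5], landmarks.shape)
```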
A complete description of the database, together with benchmark findings, will be published in the conference proceedings (Zhang et al., 2013). Based on our past experience, we anticipate that the database will serve as a valuable and well-utilized benchmark for research and development in biometrics, security, biomedicine, psychology, affective computing, computer graphics, and related fields.

References:

Baker, S., Gross, R., & Matthews, I. (2004). Lucas-Kanade 20 years on: A unifying framework: Part 4. IEEE Transactions on Pattern Analysis and Machine Intelligence, 26, 810-815.

Ekman, P., Friesen, W. V., & Hager, J. C. (2002). Facial Action Coding System. Salt Lake City, UT: Research Nexus, Network Research Information.

Lucey, S., Wang, Y., Cox, M., Sridharan, S., & Cohn, J. F. (2009). Efficient constrained local model fitting for non-rigid face alignment. Image and Vision Computing, 27(12), 1804-1813.

Reale, M., Zhang, X., & Yin, L. (2013). Nebula feature: A space-time feature for posed and spontaneous 4D facial behavior analysis. Proceedings of the 10th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2013), Shanghai, China, April 2013.

Zhang, X., Yin, L., Cohn, J., Canavan, S., Reale, M., Horowitz, A., & Liu, P. (2013). A high-resolution spontaneous 3D dynamic facial expression database. Proceedings of the 10th IEEE International Conference on Automatic Face and Gesture Recognition (FG 2013), Shanghai, China, April 2013.

Project Start:
Project End:
Budget Start: 2010-09-01
Budget End: 2012-08-31
Support Year:
Fiscal Year: 2010
Total Cost: $44,216
Indirect Cost:
Name: University of Pittsburgh
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15260