As we talk with others, facial expressions, head movements, and nonverbal speech characteristics serve as crucial cues to our emotional tone. For this process to be effective, the cues to expressed affect must change constantly, and that constant change makes them difficult to measure and study. The lack of tools for measuring and modeling the multiple cues used to interpret live facial expressions has limited the scientific study of the regulation of facial expression in conversation. This proposal consists of experiments designed to manipulate the specific cues thought to determine perceivers' understanding of communicators' emotional states during live conversations. The primary goals of this project are twofold. The first goal is to develop and refine tools (e.g., computer software programs) that will allow scientists to better understand the specific facial and movement cues that allow people to express and understand communicated affect during live conversation. The second goal is to refine the software that allows such tests of affective communication, so that it serves as a new tool for a host of behavioral scientists who wish to study and manipulate the nature of live interactions. For example, this software will allow experimenters to manipulate the perceived gender of two people who are having a live getting-acquainted conversation -- so that a man who is actually talking to another man sees a realistic, convincing facial avatar of a woman (who perfectly mimics the facial and head movements of the actual male interaction partner). The software developed and refined in this proposal will thus allow social and cognitive psychologists to manipulate, for example, the social categories of people's interaction partners in ways not imagined a decade ago.
The impact of this project is exceptionally wide-ranging. One area of application is measuring and modeling facial dynamics -- that is, using this technology to further the study of live facial expression. The current project will provide open-source software that can be distributed free for research purposes. Research labs that wish to use the software will also be provided with free supporting materials (e.g., plans, equipment lists) to facilitate the dissemination of the technology resulting from the current project. A second area of application is the study of intercultural communication. This new technology will allow sophisticated studies of social interaction and communication in small-group, high-stress settings in which emotional regulation is critical (e.g., intercultural interactions by police or military personnel) or in negotiation settings (e.g., diplomatic relations). Further, the technology could be used to allow live video interactions between people that preserve the tone and content of communication while still maintaining speaker confidentiality. A third area of application is educational technology. A problem with virtual learning environments is the difficulty of detecting students' nonverbal cues -- cues that a live classroom teacher often uses to assess student understanding or confusion. The work may lead to automatic recognition of these cues from video capture. This information could then be used to improve the virtual learning environment or to refine the specific teaching materials that are broadcast to students.
Thin Slice Video Ratings of Emotion
Untrained raters watched short, five-second videos of people they had never met and were asked to judge what the person was feeling by rating the accuracy of statements such as, "The person in the video felt happy." The videos were extracts from real conversations between unacquainted young adults and showed only one of the two people in the conversation, as in Figure 1-a. The raters tended to agree about what the person in the video was feeling during those five seconds. Factor analysis with oblique rotation identified six factors of emotion used by the raters: (1) Anger, (2) Joy, (3) Anxiety, (4) Sadness, (5) Shame, and (6) Compassion. Raters in a second experiment performed the same task, except that the video clips were presented silently. The same six factors of emotion emerged, suggesting that these dimensions of emotion are present both in the audio track and in the facial expressions alone.

Software
A new graphical user interface (FaceModelBuilder) for specifying Combined Appearance Models (CAMs) was written and is in beta testing. The software tracks facial expressions in real time, as shown in Figure 1-b. The interface is written in PyQt so that it can be run on Macs, Windows PCs, and Linux machines. FaceModelBuilder is in beta testing at the University of Virginia and the Max Planck Institute for Human Development in Berlin. FaceModelBuilder is licensed under the Apache 2.0 license so that, once released, it will be free and open source.

Tracking Emotion
A computer algorithm was developed that uses the CAM software to track facial expressions and output an estimate of the six emotion factors described above. This effectively maps the correlated space of emotion words onto a person-specific model of facial expression, as shown in Figure 1-c. The method is fast and can generate six emotion scores for each video frame in real time. We are now working on validating and improving the ratings produced by this method.
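For readers unfamiliar with the analysis behind the thin-slice ratings, the sketch below shows how a six-factor solution with an oblique rotation might be computed. It is a minimal illustration, not the project's actual analysis code: the file name ratings.csv, the clips-by-statements layout of the data, and the choice of a promax rotation (one common oblique rotation, via the open-source factor_analyzer package) are all assumptions.

```python
# Minimal sketch: exploratory factor analysis with an oblique rotation.
# Assumes `ratings.csv` holds one row per rated clip and one column per
# emotion statement (e.g., "felt happy", "felt ashamed"); file name and
# column layout are illustrative assumptions.
import pandas as pd
from factor_analyzer import FactorAnalyzer

ratings = pd.read_csv("ratings.csv")  # clips x emotion-statement ratings

# Six factors with promax, an oblique rotation that lets factors correlate,
# as dimensions of rated emotion typically do.
fa = FactorAnalyzer(n_factors=6, rotation="promax")
fa.fit(ratings)

# Loadings show how strongly each rated statement maps onto each factor.
loadings = pd.DataFrame(
    fa.loadings_,
    index=ratings.columns,
    columns=[f"Factor{i + 1}" for i in range(6)],
)
print(loadings.round(2))

# Per-clip factor scores (interpreted in the project as Anger, Joy,
# Anxiety, Sadness, Shame, and Compassion).
scores = fa.transform(ratings)
```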
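The per-frame emotion-tracking step described above can likewise be sketched as a simple mapping from tracked appearance parameters to six emotion scores. The snippet below is a hypothetical illustration, assuming each video frame has already been reduced to a CAM parameter vector by the tracker; the linear ridge mapping and the synthetic training data are assumptions for demonstration, not the project's actual algorithm.

```python
# Hypothetical sketch: map a person-specific vector of appearance-model (CAM)
# parameters to six emotion scores for each video frame. The ridge regression
# and the synthetic data are illustrative assumptions.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
n_train, n_params = 500, 30  # rated training frames, CAM parameter dimensions
cam_train = rng.normal(size=(n_train, n_params))   # CAM parameters per rated frame
emotion_train = rng.normal(size=(n_train, 6))      # six emotion-factor scores per frame

# Fit one multi-output linear map from appearance parameters to emotion scores.
model = Ridge(alpha=1.0).fit(cam_train, emotion_train)

def score_frame(cam_params: np.ndarray) -> np.ndarray:
    """Return six emotion scores for one frame's CAM parameter vector."""
    # A single small matrix product, so it easily keeps up with video frame rates.
    return model.predict(cam_params.reshape(1, -1)).ravel()

new_frame = rng.normal(size=n_params)
print(score_frame(new_frame))  # e.g., [anger, joy, anxiety, sadness, shame, compassion]
```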