Effective communication of emotion relies on a complex interplay of complementary non-verbal channels involving facial and vocal changes. In addition to emotion expression, imitation of emotion plays a major role in emotion understanding by replicating it. Effective use of these face-voice channels, and identification of the changes therein, is impaired in many patients with neuropsychiatric disorders. Quantifying such deficits will advance basic and clinical research, eventually leading to improved diagnostic accuracy and assessment of treatment effects. Current methodology relies heavily on clinical ratings that may be subjective and rater dependent. The limited automated methods available for facial expression and voice analysis can recognize an emotion but have been largely unsuccessful in quantifying its degree. This has created the need for objective automated methods of emotion evaluation that can quantify emotion, supplement clinical ratings, and aid diagnostic decisions. This project seeks to address these issues by developing and validating advanced automated computerized tools that can objectively and reliably quantify multimodal affect processing. This comprehensive quantified assessment of emotion expression and imitation, using single or combined audio-visual modalities of facial expression and voice, will determine the impact of each channel on emotion understanding and on the identification of affect-related differences between patient and control groups, thereby complementing and augmenting current clinical symptom rating scales. The measures we produce will be easy to employ and could facilitate large-scale studies of affect impairment and affect change across disorders that impair affect.
In Aim 1 we will develop and validate classifier-based methods for facial affect analysis based on automated temporal action unit profiles, for quantifying facial emotion expression and imitation in the presence of speech.
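The classifier-based approach of Aim 1 can be illustrated with a minimal sketch: per-clip summary statistics are computed over temporal action-unit (AU) profiles and fed to an off-the-shelf classifier. Everything here is invented for illustration (synthetic data, AU count, the scikit-learn SVM); it is not the project's actual pipeline.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Synthetic stand-in for temporal AU profiles: each clip is a sequence
# of frame-wise AU intensities, summarized by simple per-AU statistics
# (mean, max, std) before classification. All sizes are hypothetical.
n_clips, n_frames, n_aus = 120, 50, 17

def summarize(profiles):
    """Collapse a (clips, frames, AUs) tensor into per-clip features."""
    return np.concatenate(
        [profiles.mean(axis=1), profiles.max(axis=1), profiles.std(axis=1)],
        axis=1,
    )

# Two toy "emotion" classes: class-1 clips carry a raised offset on a few AUs.
profiles = rng.normal(size=(n_clips, n_frames, n_aus))
labels = rng.integers(0, 2, size=n_clips)
profiles[labels == 1, :, :5] += 1.0

features = summarize(profiles)          # shape: (120, 51)
scores = cross_val_score(SVC(kernel="rbf"), features, labels, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

Summarizing the time series into fixed-length statistics is one simple way to handle variable-length clips; the real system would presumably use richer temporal profile features.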
In Aim 2, we will develop and validate emotion classifiers based on spectral and prosodic features extracted from the acoustic signal; these will quantify emotion in expressed and imitated voice. Finally, in Aim 3, we will create a video-based automated emotion expression quantification system that fuses the facial and voice features identified in Aims 1 and 2. The population-specific set of face-voice classifiers will best elucidate patient-control differences in expression and imitation, and results will be compared to clinical ratings. We expect that on successful completion of the project we will have an integrated collection of objective facial and speech expression analysis tools, usable by neuropsychiatrists to quantify the degree of emotion impairment and study treatment effects. We expect our methods to influence procedures used for diagnosing schizophrenia, and perhaps affective disorders and autism spectrum disorders. The methods will be generic and could be extended to other neuropsychiatric or neurological conditions that cause deficits in emotional expressiveness.
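The face-voice fusion of Aim 3 can be sketched, under strong simplifying assumptions, as feature-level fusion: concatenating per-clip facial and vocal feature vectors before classification. The feature names, dimensions, offsets, and classifier below are all hypothetical illustrations, not the project's actual design.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 150

# Hypothetical per-clip feature vectors from each modality.
face_feats = rng.normal(size=(n, 10))   # e.g. AU-profile summaries
voice_feats = rng.normal(size=(n, 6))   # e.g. pitch/energy statistics
labels = rng.integers(0, 2, size=n)

# Make each modality weakly informative on its own (toy class signal).
face_feats[labels == 1, 0] += 1.2
voice_feats[labels == 1, 0] += 1.2

clf = make_pipeline(StandardScaler(), LogisticRegression())

def cv_acc(X):
    """Mean 5-fold cross-validated accuracy for one feature set."""
    return cross_val_score(clf, X, labels, cv=5).mean()

fused = np.hstack([face_feats, voice_feats])  # feature-level fusion
print(f"face only : {cv_acc(face_feats):.2f}")
print(f"voice only: {cv_acc(voice_feats):.2f}")
print(f"fused     : {cv_acc(fused):.2f}")
```

Concatenation is the simplest fusion strategy; when the two channels carry complementary signal, the fused classifier tends to outperform either modality alone, which is the intuition behind combining face and voice here.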

Public Health Relevance

The project seeks to quantify emotion in facial expression and voice by developing advanced computational tools that objectively address the limitations of the subjective methods neuropsychiatrists currently use to study disease-induced impairments of emotion production. These well-validated tools will be applied to video datasets of patients with schizophrenia and controls to determine group differences and to study disease progression and treatment effects.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Study Section
Neural Basis of Psychopathology, Addictions and Sleep Disorders Study Section (NPAS)
Program Officer
Freund, Michelle
University of Pennsylvania
Schools of Medicine
United States
Hamm, Jihun; Kohler, Christian G; Gur, Ruben C et al. (2011) Automated Facial Action Coding System for dynamic analysis of facial expressions in neuropsychiatric disorders. J Neurosci Methods 200:237-56
Hamm, Jihun; Ye, Dong Hye; Verma, Ragini et al. (2010) GRAM: A framework for geodesic registration on anatomical manifolds. Med Image Anal 14:633-42
Wang, Peng; Barrett, Frederick; Martin, Elizabeth et al. (2008) Automated video-based facial expression analysis of neuropsychiatric disorders. J Neurosci Methods 168:224-38
Wang, Peng; Verma, Ragini (2008) On classifying disease-induced patterns in the brain using diffusion tensor images. Med Image Comput Comput Assist Interv 11:908-16
Alvino, Christopher; Kohler, Christian; Barrett, Frederick et al. (2007) Computerized measurement of facial expression of emotions in schizophrenia. J Neurosci Methods 163:350-61