The main goal of this project is to develop methods for the quantification of facial expressions. Facial expression analysis is increasingly used in clinical investigations of neuropsychiatric disorders, including affective disorders and schizophrenia, which cause deficits in the perception and expression of emotion. However, clinicians still rely on expression rating methods that are manual, largely qualitative, and typically of low reproducibility. This project seeks to develop objective, automated tools that will significantly augment current capabilities for reliable clinical diagnosis and follow-up.

The proposed tools perform a morphometric analysis of fine-grained structural deformations of the face during an expression change. Faces will be represented using deformable models, as a combination of elastic regions that deform (expand and contract) as the expression changes. The deformation between two faces with different expressions will be estimated through a high-dimensional shape transformation, which will be used to define the quantification measure, with the neutral expression or a standardized template serving as the reference, depending on the study design. In video sequences, the shape transformation between subsequent frames will be temporally propagated, thereby combining spatial and temporal information. These methods will be validated against clinically accepted expression rating scales in terms of their ability to replicate clinically established results, with emphasis on quantifying differences in expression between patients with affective disorders and healthy controls. Upon completion of the project, an integrated collection of expression quantification tools is expected to be delivered to clinicians, improving diagnostic accuracy in affect-related disorders and providing quantification measures beyond the scope of existing clinical techniques.
These tools are expected to give neuropsychiatrists the ability to quantify the degree of impairment in affect expression, quantitatively assess response to medication, obtain behavioral predictors of violence and aggression, and find endophenotypic markers in children, adolescents, and family members of patients that could potentially predict future onset of the disorder. The long-term goal of the project is to provide expression quantification methods that are reliable, objective, reproducible, and easily usable by clinicians, and that will significantly influence the procedures used for accurately diagnosing clinical conditions that cause deficits in emotional expressiveness, such as schizophrenia, affective disorders, Parkinson's disease, and senile dementias.
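The core quantification idea in the abstract can be illustrated with a minimal sketch. The abstract specifies a high-dimensional shape transformation estimated with deformable models; the version below is a deliberate simplification that reduces faces to 2-D landmark sets and measures deformation as the mean displacement magnitude relative to a reference (the neutral face or a template), with frame-to-frame displacements composed across a video sequence. All function names and the landmark representation here are illustrative assumptions, not the project's actual implementation.

```python
import numpy as np

def expression_change(ref_landmarks, expr_landmarks):
    """Quantify an expression change as the mean magnitude of the
    landmark-wise displacement field relative to a reference face
    (a low-dimensional stand-in for the shape transformation
    described in the abstract)."""
    disp = expr_landmarks - ref_landmarks      # per-landmark displacement vectors
    return np.linalg.norm(disp, axis=1).mean() # mean deformation magnitude

def propagate(frames):
    """Compose frame-to-frame displacements across a video sequence,
    accumulating the total deformation of the last frame relative
    to the first (temporal propagation of the transformation)."""
    total = np.zeros_like(frames[0])
    for prev, cur in zip(frames, frames[1:]):
        total += cur - prev                    # chain successive displacements
    return np.linalg.norm(total, axis=1).mean()

# Illustrative usage with synthetic landmark data:
neutral = np.zeros((2, 2))                     # hypothetical neutral-face landmarks
smile = np.array([[3.0, 4.0], [0.0, 0.0]])     # hypothetical expressive-face landmarks
score = expression_change(neutral, smile)      # mean displacement = (5 + 0) / 2 = 2.5
```

Because the per-frame displacements telescope, propagating over intermediate frames recovers the same measure as comparing the first and last frames directly; in a real system each frame-to-frame transformation would be re-estimated, which is what makes the temporal composition informative.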

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Project #
Application #
Study Section
Special Emphasis Panel (ZRG1-BDCN-B (01))
Program Officer
Huerta, Michael F
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
University of Pennsylvania
Schools of Medicine
United States
Zip Code
Savran, Arman; Cao, Houwei; Nenkova, Ani et al. (2015) Temporal Bayesian Fusion for Affect Sensing: Combining Video, Audio, and Lexical Modalities. IEEE Trans Cybern 45:1927-41
Cao, Houwei; Verma, Ragini; Nenkova, Ani (2015) Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech. Comput Speech Lang 28:186-202
Cao, Houwei; Savran, Arman; Verma, Ragini et al. (2015) Acoustic and Lexical Representations for Affect Prediction in Spontaneous Conversations. Comput Speech Lang 29:203-217
Cao, Houwei; Cooper, David G; Keutmann, Michael K et al. (2014) CREMA-D: Crowd-sourced Emotional Multimodal Actors Dataset. IEEE Trans Affect Comput 5:377-390
Shah, Miraj; Cooper, David G; Cao, Houwei et al. (2013) Action Unit Models of Facial Expression of Emotion in the Presence of Speech. Int Conf Affect Comput Intell Interact Workshops 2013:49-54
Savran, Arman; Cao, Houwei; Shah, Miraj et al. (2012) Combining Video, Audio and Lexical Indicators of Affect in Spontaneous Conversation via Particle Filtering. Proc ACM Int Conf Multimodal Interact 2012:485-492
Hamm, Jihun; Kohler, Christian G; Gur, Ruben C et al. (2011) Automated Facial Action Coding System for dynamic analysis of facial expressions in neuropsychiatric disorders. J Neurosci Methods 200:237-56
Bitouk, Dmitri; Verma, Ragini; Nenkova, Ani (2010) Class-Level Spectral Features for Emotion Recognition. Speech Commun 52:613-625
Hamm, Jihun; Ye, Dong Hye; Verma, Ragini et al. (2010) GRAM: A framework for geodesic registration on anatomical manifolds. Med Image Anal 14:633-42
Wang, Peng; Verma, Ragini (2008) On classifying disease-induced patterns in the brain using diffusion tensor images. Med Image Comput Comput Assist Interv 11:908-16

Showing the most recent 10 out of 12 publications