Research on sentiment and emotion in textual and spoken language relies heavily on human annotation of corpora and human judgments of words in context to provide gold standard data and useful features for prediction. Current resources such as Whissell's Dictionary of Affect in Language (DAL) are of limited utility because of their limited coverage. To provide richer gold standards, in recent years computational linguists have been making use of crowd-sourcing websites such as Amazon Mechanical Turk to accomplish annotation tasks that previously would have been done by trained annotators or human subjects in laboratory experiments. Recently there has been some research on using social media websites for similar purposes (mining existing social exchanges for information and creating games to elicit new information). In this project the PI will explore the use of all three of these approaches to augment existing data on human judgments of the emotional content of lexical items to support her ongoing investigation into the classification of emotional text and speech (a major objective of which is to move beyond simple valence judgments relying upon acoustic and prosodic features by taking into account more nuanced aspects of affective text). She will examine the value of crowd-sourcing, social media mining, and games implemented in social websites to ascertain what lessons can be learned about acquiring reliable annotations and judgments of emotional content.
Broader Impacts: This exploratory research has the potential to open up new sources of information about the affective connotations of lexical items that will be invaluable to researchers working on affect in text and in speech. Advances in the automatic identification of affect in text and in speech would have major applications in fields such as business and medicine, inter alia. Business interest in assessing consumer opinion is rapidly moving from focus groups to analyses of product reviews, while medical informatics researchers similarly try to learn patient attitudes to diagnoses, drugs, and treatments by mining online forums. A major component of such endeavors is the use of affect dictionaries. New and better sources of affective connotations of lexical items should prove enormously helpful in these efforts, increasing our ability to learn useful, practical information from the data individuals provide freely online. The lessons learned through these experiments and the results of subsequent data collection will be made available to the larger research community in the form of new affect lexicons.
Our general goal in this project was to bring together research on sentiment analysis from text with studies of emotional speech to automatically identify a wide range of speaker emotions, including happiness, sadness, anger, surprise, fear, and disgust, from text. These are thought to be the "classic" emotions in the psychology literature. We trained a machine learning classifier on data from LiveJournal (a blog site in which people identify the emotion they most closely associate with their blog post) and on data we collected from annotators asked to describe a large set of scenes. Our particular goal was to be able to determine what sort of emotional facial expressions we should generate in a project we have been working on in text-to-scene generation. This system, called WordsEye, automatically creates scenes from simple text input. We use a third-party program called FaceGen to generate faces with different expressions based upon our work in emotion classification from text. We have published a paper on the results of our research in which we report 40-63% success rates in predicting which emotion to generate (Ulinski et al., "Finding Emotion in Image Descriptions," WISDOM 2012). Samples of happy and angry faces generated in WordsEye are included with this report.