The first aspect of this research develops an expressive visual speech synthesis module for a prototype virtual actor. The goal is to give virtual actors realistic, expressive facial animation generated in real time from speech input. Facial motion is one of the most challenging problems in computer animation, since the human face is the most complex muscular region of the human body. At the same time, expressive visual speech is critical for virtual humans that need to interact orally in diverse scenarios. The second aspect of this research addresses the problem of expressive visual speech synthesis using a machine learning approach that relies on a database of speech-related, high-fidelity facial motions. From this training set, a generative model of expressive facial motion is derived that incorporates emotion control while maintaining accurate lip-synching. The emotional content of the input speech can be specified manually by the user or extracted automatically from the audio signal using a Support Vector Machine classifier. This research introduces a novel real-time, search-based approach to facial motion synthesis. In addition, it is the first work to develop and incorporate an emotion mapping model as an integral part of the facial motion synthesis process.
Broader Impact. The results will have an immediate impact on the entertainment industry and on computer-assisted education. This work provides a missing piece toward the ambitious goal of developing expressive virtual humans. Real-time, high-quality facial motion synthesis is crucial for interactive games that involve virtual human characters. The US gaming industry is a billion-dollar industry that employs thousands of US citizens. This research makes significant advances in this area and can give the industry an important competitive advantage. A growing number of social and educational applications can also benefit from high-quality, interactive virtual humans, such as interactive training systems and virtual tutors. In the future, inexpensive virtual tutors could provide customized tutoring to under-achieving children or children with special needs. This research advances the state of the art in facial animation, which, once mature, has the potential to revolutionize the way computers interact with humans.