This collaborative project develops and evaluates lifelike, natural computer interfaces as portals to intelligent programs in the context of a Decision Support System (DSS). It aims to provide a natural interface that supports realistic spoken dialog, non-verbal cues, and the capability to learn, keeping its knowledge current and correct. The research objectives center on the development of an avatar-based interface with which the DSS user can interact. Communication with the avatar takes place in spoken natural language, combined with gestures or by pointing at the screen. The system supports speaker-independent continuous speech input as spontaneous dialog within the specified DSS domain. A robust backend that can respond intelligently to the questions asked by the DSS user is expected to generate the responses spoken in reply by the avatar, with realistic inflection and visual expressions.

The work develops, prototypes, and evaluates the desired user-interface capabilities by using the model of a program officer to create a realistic avatar that can answer users' questions and respond in a humanly natural manner. The project extends a currently sponsored project that gathers information related to a centers program, in which a program officer serves as subject-matter expert. The recently developed AlexDSS system, which answers users' questions about the I/UCRC program, provides the baseline intelligent system behind the avatar. The avatar interfaces are targeted both at general users and at experts responsible for updating and correcting the domain knowledge therein.

The work represents a collaborative project between the Intelligent Systems Laboratory (ISL) at UCF and the Electronic Visualization Laboratory (EVL) at UIC. The EVL team focuses on avatar development encompassing Visualization and Interaction with Realistic Avatars and Evaluation of System Naturalness and Usability. The ISL team concentrates on Natural Language Recognition and on Automated Knowledge Update and Refinement.

Project Report

The objective of the research was to develop and test the technology needed to create lifelike avatars (also called virtual humans) that could represent the likeness and knowledge of an actual, specific human. Such an avatar would interact with human users in spoken natural language and answer questions normally asked of the actual person, in his or her absence. The project sought to achieve naturalness in the communication interchange. There were two aspects to the research: 1) providing the avatar with a physical appearance close in likeness to the person it represents; and 2) embedding intelligence and knowledge in the avatar so that it is as knowledgeable in a specific topic as the person it represents. This included the ability to remember parts of prior conversations as well as the current conversation. The research was conducted by two collaborating teams, one at the University of Central Florida (UCF) and the other at the University of Illinois at Chicago (UIC). The UCF team addressed the issues of intelligence and knowledge, including the avatar's ability to communicate via natural spoken language, while the UIC team addressed the computer graphics issues involved in the avatar's appearance. Figure 1 depicts the final version of the avatar (called the AlexAvatar), made to represent Dr. Alex Schwarzkopf, who at the time was the director of the NSF I/UCRC program. The main finding was that such avatars could be built and made to reflect the appearance and knowledge of their human counterparts. The AlexAvatar was built and evaluated with human test subjects on several occasions. The results showed that, despite some difficulties with the automated speech recognition system, the AlexAvatar was largely able to communicate in natural spoken language and answer general questions, although the range of questions was necessarily limited.
As part of the research, advances of high intellectual merit were made in dialog management. Our approach was to use the context of the conversation as the primary driver of the communication exchange. While context has been used extensively in natural language processing, it has generally been treated as supplementary information to assist other approaches; our work used the context of the conversation as the primary element for determining the response to a question. Second, we built an episodic memory model to allow the avatar to remember important elements of the current conversation or of prior conversations. Both components of the intelligent avatar system were tested extensively.

The broader impact of the research included the education and training of 23 students at UCF and UIC: seven graduate students (five PhD and two MS) and 16 undergraduate students. All contributed in different ways to the success of the project. Additionally, the findings from this grant led to a second NSF grant to build a museum exhibit demonstrating the avatar technology to middle school children. We are confident that this exhibit will help convince some museum visitors of that age group to consider careers in STEM fields. The exhibit, set to open at the Orlando Science Center in June of 2014, is themed around Alan Turing and his Turing Test for machine intelligence. The museum visitor is asked to "build" an avatar from a selected group of photographs and then provide it with intelligence so that it can answer questions, the goal being to determine whether the figure on screen is a human on a teleconference or an avatar of the same person. The avatars are pre-built, and the visitor only appears to be building one. However, we plan to enhance the exhibit experience to allow the visitor to actually build the avatar's intelligence and take it home for further experimentation and enhancement.

The main screen features an avatar of Dr. Turing on the left and the avatar "built" by the museum guest on the right. The background is a photograph of a replica of the Colossus computer used to break German ciphers during World War II. The dialog of the Turing avatar is scripted and pre-recorded by a human voice-over; it acts as the narrator of the exhibit. The other avatar, on the right, is the one able to answer questions from the museum guest. It is not scripted; it uses its artificial intelligence and knowledge, paired with a synthetic voice, to communicate with the museum guest. Our plans are to continue developing this technology and applying it to education, both formal and informal, as well as to health care. In the latter, an avatar could take the place of a doctor or nurse and communicate directly with a patient in the context of telemedicine, albeit within very narrowly defined limits.
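The context-first dialog strategy and episodic memory described in the report can be illustrated with a minimal sketch. All class names, the knowledge-base format, and the keyword-matching scheme below are hypothetical simplifications for illustration, not the actual AlexDSS implementation:

```python
# Minimal sketch of a context-first dialog manager with episodic memory.
# The knowledge format {topic: {keyword: response}} and the matching
# logic are illustrative assumptions, not the AlexDSS design.

class DialogManager:
    def __init__(self, knowledge):
        self.knowledge = knowledge          # {topic: {keyword: response}}
        self.current_topic = "general"      # the conversational context
        self.memory = []                    # episodic memory: (utterance, response, topic)

    def respond(self, utterance):
        words = set(utterance.lower().split())
        # Context first: try to interpret the utterance within the
        # active topic before falling back to the other topics.
        topics = [self.current_topic] + [
            t for t in self.knowledge if t != self.current_topic
        ]
        for topic in topics:
            for keyword, response in self.knowledge.get(topic, {}).items():
                if keyword in words:
                    self.current_topic = topic  # a match shifts the context
                    self.memory.append((utterance, response, topic))
                    return response
        # Episodic recall: reuse the reply to the most recent similar utterance.
        for past_utt, past_resp, _ in reversed(self.memory):
            if words & set(past_utt.lower().split()):
                return past_resp
        return "Could you rephrase that?"
```

The key design point mirrored here is that topic context, not a global keyword search, is consulted first when selecting a response, and every exchange is stored as an episode that later turns can fall back on.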

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Network Systems (CNS)
Application #
0703927
Program Officer
Rita V. Rodriguez
Project Start
Project End
Budget Start
2007-02-15
Budget End
2014-01-31
Support Year
Fiscal Year
2007
Total Cost
$692,443
Indirect Cost
Name
University of Central Florida
Department
Type
DUNS #
City
Orlando
State
FL
Country
United States
Zip Code
32816