A new generation of robots is emerging that aims to interact with humans on a daily basis to provide service, care, and companionship. To support natural interaction between people and this type of robot, robust language processing enabling situated human-robot dialogue will become increasingly important. Compared to traditional spoken dialogue systems and multimodal conversational interfaces, situated human-robot dialogue is drastically different due to two unique characteristics. The first is situatedness: a robot is situated in a physical world that it shares with its human partners. The spatial relations among the robot, the human, and the environment, together with the dynamic nature of the surroundings, strongly influence how the robot accomplishes its tasks and interacts with the human. The second is embodiment: a robot and its human partner both have physical bodies in the environment. In embodied communication, speakers make extensive use of non-verbal modalities (e.g., eye gaze and gestures) to engage in conversation and to refer to the shared environment. These two characteristics make automated interpretation of human language in situated human-robot dialogue extremely challenging.

This award provides the PI and her team with an enhanced infrastructure that will enable them to address these challenges. The new resources include a physical environment and a virtual environment for human-centered investigation; tools to capture human multimodal language behaviors, including speech, eye gaze, and gesture, during situated human-robot dialogue; and systems to support Wizard-of-Oz experiments, data collection, and data analysis. The integration of a physical world and a virtual world is an innovation of the new infrastructure. Limitations of sensor and effector technology often make it difficult or expensive to change robot configurations and implement desired behaviors. The virtual-world paradigm allows efficient, high-fidelity simulation of the physical world and the robots, as well as studies of multimodal language behavior under many conditions that are otherwise difficult to obtain in the physical world. The new infrastructure will enable a human-centered approach by facilitating a wide variety of controlled experiments. It will enable new empirical findings and provide a testbed for advanced, psycholinguistically plausible techniques for multimodal language processing.
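
As a rough illustration of the kind of data capture such an infrastructure supports, the Python sketch below logs speech, gaze, and gesture events against a shared clock and groups near-simultaneous samples for joint analysis. This is a minimal sketch under stated assumptions: the Sample and SessionLog classes, the stream labels, and the alignment window are illustrative inventions, not components of the actual systems described above.

    import time
    from dataclasses import dataclass, field

    @dataclass
    class Sample:
        stream: str       # hypothetical stream label: "speech", "gaze", or "gesture"
        timestamp: float  # seconds on a shared monotonic clock
        payload: dict     # modality-specific data (words, fixation point, hand pose)

    @dataclass
    class SessionLog:
        samples: list = field(default_factory=list)

        def record(self, stream, payload):
            # Stamp every sample with one shared clock so modalities
            # recorded by different devices can be aligned afterwards.
            self.samples.append(Sample(stream, time.monotonic(), payload))

        def aligned(self, window=0.1):
            # Group samples whose timestamps fall within `window` seconds of
            # the group's first sample, yielding co-occurring multimodal events.
            groups, current = [], []
            for s in sorted(self.samples, key=lambda s: s.timestamp):
                if current and s.timestamp - current[0].timestamp > window:
                    groups.append(current)
                    current = []
                current.append(s)
            if current:
                groups.append(current)
            return groups

In a Wizard-of-Oz session, each capture device would call record() as events arrive, and aligned() would later pair, for example, a spoken referring expression with the gaze fixation that accompanied it.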

Broader Impacts: The new infrastructure will provide tremendous research and collaborative opportunities for the PI and her colleagues at Michigan State University. It will have profound implications for enabling the next generation of social and cognitive robots. The infrastructure will also provide new and exciting training and education opportunities for students at MSU through research mentoring and curriculum development, and it will bring new educational experiences to K-12 students and encourage broader participation in engineering through several outreach programs at MSU.

Project Report

A new generation of robots has emerged in recent years to serve as assistants and companions to human partners. To support natural interaction between humans and this type of cognitive robot, technology enabling situated human-robot dialogue has become increasingly important. This project acquired resources toward a new infrastructure to support research on situated human-robot dialogue. More specifically, several robots (e.g., a PeopleBot, a NAO, and a robotic arm) were acquired, together with equipment to track human multimodal language behaviors (e.g., a mobile eye tracker, a Kinect, and a stereo camera). The project also developed software tools for virtual-world simulation and created systems for Wizard-of-Oz experiments. The resulting infrastructure has provided many opportunities to conduct research on situated human-robot dialogue; for example, it has been used to develop and evaluate algorithms for collaborative referential grounding in human-robot dialogue. In addition, the project has provided exciting training and education opportunities for K-12 students and for students at MSU. For example, the NAO robot was demonstrated to kindergarten children, and the PeopleBot was brought into the classroom to teach concepts in Artificial Intelligence. In the long run, this infrastructure will enable new empirical findings and provide a testbed for advanced techniques for situated multimodal language processing in human-robot dialogue.
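
To make the referential grounding task concrete, here is a minimal sketch of one common approach: scoring candidate objects by combining how well their attributes match a referring expression with how salient each object was in the speaker's gaze. The feature functions, weights, and object representation below are illustrative assumptions and do not reproduce the project's actual algorithms.

    def ground_referent(expression_features, candidates, w_lang=0.7, w_gaze=0.3):
        # Return the candidate object that best matches the referring expression.
        def score(obj):
            # Fraction of requested attributes (e.g., color, type) the object matches.
            lang = sum(obj["attributes"].get(k) == v
                       for k, v in expression_features.items()) / max(len(expression_features), 1)
            # Normalized gaze fixation time on the object during the utterance.
            gaze = obj.get("gaze_salience", 0.0)
            return w_lang * lang + w_gaze * gaze
        return max(candidates, key=score)

    # Example: grounding "the red mug" while the speaker mostly fixates object o2.
    objects = [
        {"id": "o1", "attributes": {"color": "red", "type": "block"}, "gaze_salience": 0.1},
        {"id": "o2", "attributes": {"color": "red", "type": "mug"},   "gaze_salience": 0.8},
    ]
    print(ground_referent({"color": "red", "type": "mug"}, objects)["id"])  # -> o2

Collaborative grounding goes further than this one-shot ranking: when no candidate scores clearly above the rest, a dialogue system can ask a clarification question and update the scores from the human's answer.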

Agency: National Science Foundation (NSF)
Institute: Division of Computer and Network Systems (CNS)
Type: Standard Grant (Standard)
Application #: 0957039
Program Officer: Ephraim P. Glinert
Budget Start: 2010-03-01
Budget End: 2014-02-28
Fiscal Year: 2009
Total Cost: $229,275
Name: Michigan State University
City: East Lansing
State: MI
Country: United States
Zip Code: 48824