Intelligent assistants such as Apple's Siri, Amazon's Alexa and Microsoft's Cortana are rapidly gaining popularity by providing a conversational natural language interface for users to access various online services and digital content. They allow computing tasks to be performed in contexts where users cannot touch their phones (such as while driving), and on wearable and Internet of Things (IoT) devices (such as Google Home). However, such conversational interfaces are limited in their ability to handle the "long tail" of tasks and suffer from a lack of customizability. This research will explore a new multi-modal, interactive, programming-by-demonstration (PBD) approach that enables end users to add new capabilities to an intelligent assistant by programming automation scripts for tasks in any existing third-party Android mobile app using a combination of demonstrations and verbal instructions. The system will leverage state-of-the-art machine learning and natural language processing techniques to comprehend the user's verbal instructions that supply information missing from the demonstration, such as implicit conditions, user intent and personal preferences. The user's demonstration on the graphical user interface will be used for grounding the conversation and reinforcing the natural language understanding model. The system will point the way toward helping the general public use their smartphones, IoT devices and intelligent assistants more effectively, increasing the adoption, efficiency and correctness of use of these technologies. The integration of intelligent assistants with PBD will have broad impact by exposing people to programming concepts in an easy-to-learn way, thereby increasing computational thinking.

This project will result in several innovations beyond the current state of the art through advances in programming by demonstration (PBD) and intelligent assistants, and especially in their integration. The work will explore leveraging verbal instructions as an additional modality to address long-standing challenges in PBD research, including generalizing data descriptions and adding control structures. The project will investigate how to coordinate the two modalities so that the intelligent assistant learns new tasks effectively and efficiently from users, and will study how users employ the two modalities in multi-modal PBD systems when programming tasks in different situations. New ways will be developed to enhance speech recognition and language understanding by using the strings and other context of the graphical user interfaces (GUIs) that apps display on the smartphone. The ability of the conversational assistant to participate in this generalization process will be enhanced, with a focus on having the system ask appropriate and helpful questions so that the task automation fits the user's needs and intentions. New approaches to representing scripts created by PBD systems so that users can read, understand and edit them will be explored, as will ways to increase trust in and the usefulness of the scripts and to support error handling, debugging and maintenance. The new technology will also be able to extract data from and enter data into apps, and to learn, through demonstration and verbal instruction, how to transform the data into appropriate formats. Finally, the project will investigate how to support sharing of scripts created by PBD systems while ensuring appropriate levels of privacy and security.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1814472
Program Officer: Ephraim Glinert
Project Start:
Project End:
Budget Start: 2018-08-15
Budget End: 2021-07-31
Support Year:
Fiscal Year: 2018
Total Cost: $531,019
Indirect Cost:
Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213