Modern automatic speech recognition (ASR) tools offer near-human accuracy in many scenarios. This has made speech-driven input increasingly popular in applications on devices such as tablets and smartphones, and has enabled personal conversational assistants. In this context, this project will study a seemingly simple but fundamental question: how should one design a speech-driven system for querying structured data? Structured data querying is ubiquitous in the enterprise, healthcare, and other domains, and typing queries in the Structured Query Language (SQL) is the gold standard for such querying. But typing SQL is painful or impossible in the above environments, which restricts when and how users can consume their data; SQL also has a steep learning curve. Existing alternatives such as typed natural language interfaces improve usability but substantially sacrifice query sophistication. For instance, conversational assistants today support queries mainly over curated vendor-specific datasets, not arbitrary database schemas, and they often fail to understand query intent. This has widened the gap between such interfaces and SQL's high query sophistication and unambiguity. This project will bridge that gap by enabling users to interact with structured data using spoken queries over arbitrary database schemas. It will lead to prototype systems on popular tablet, smartphone, and conversational assistant environments.

This could help many data professionals, such as data analysts, business reporters, and database administrators, as well as non-technical data enthusiasts. For instance, nurse informaticists could retrieve patient details more easily and unambiguously to assist doctors, while analysts could slice and dice their data even on the move. The research will be disseminated as publications in database and natural language processing conferences. The research and artifacts produced will be integrated into graduate and undergraduate courses on database systems. The PIs will continue supporting students from under-represented groups as part of this project.

This project will create three new systems for spoken querying at three levels of "naturalness." The first system targets a tractable and meaningful subset of SQL. This research will exploit three powerful properties of SQL that regular English speech lacks: an unambiguous context-free grammar, knowledge of the database schema being queried, and knowledge of tokens from the database instance being queried. These properties make it possible to support arbitrary database schemas and tokens not present in the ASR vocabulary. The PIs will synthesize and innovate upon ideas from information retrieval, natural language processing, and database indexing, and combine them with human-in-the-loop query correction to improve accuracy and efficiency. The second system will make SQL querying even more natural and stateful by changing its grammar, leading to the first speech-oriented dialect of SQL. The third system will apply the lessons from the first two to two state-of-the-art typed natural language interfaces for databases, leading to a redesign of such interfaces that exploits both the properties of speech and the database instance being queried.
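To make the use of these three properties concrete, below is a minimal sketch in Python of how an ASR transcription of a spoken SQL query might be snapped to the SQL keyword vocabulary, the schema of the database being queried, and tokens indexed from the database instance. The toy schema, the vocabularies, and the edit-distance matching are illustrative assumptions, not the project's actual design:

```python
# Minimal, hypothetical sketch of schema- and instance-aware ASR correction.
# The "patients" schema, the vocabularies, and the similarity threshold are
# illustrative assumptions only.
import difflib

SQL_KEYWORDS = {"select", "from", "where", "group", "by", "order", "and", "or"}
SCHEMA_TOKENS = {"patients", "name", "age", "ward"}   # from the schema queried
INSTANCE_TOKENS = {"icu", "oncology"}                 # values indexed from the instance

def correct_token(token, vocab, cutoff=0.7):
    """Snap an ASR token to its closest in-vocabulary match, if one is close enough."""
    match = difflib.get_close_matches(token.lower(), vocab, n=1, cutoff=cutoff)
    return match[0] if match else token

def correct_transcription(asr_tokens):
    # A real system could use SQL's unambiguous grammar to know whether a
    # keyword, identifier, or literal is expected at each position; this toy
    # version simply matches against the union of all three vocabularies.
    vocab = SQL_KEYWORDS | SCHEMA_TOKENS | INSTANCE_TOKENS
    return " ".join(correct_token(t, vocab) for t in asr_tokens)

# ASR might mis-hear "patients" as "patience" and "ward" as "word":
print(correct_transcription("select name from patience where word".split()))
# -> "select name from patients where ward"
```

As the paragraph above notes, a full system would also constrain which vocabulary is admissible at each grammar position and fall back to human-in-the-loop correction when no confident match exists.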

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Budget Start: 2018-10-01
Budget End: 2021-09-30
Fiscal Year: 2018
Total Cost: $500,000
Name: University of California San Diego
City: La Jolla
State: CA
Country: United States
Zip Code: 92093