The continuously increasing demand for fast, real-time data analytics requires in-memory database systems to provide both high-performance transaction processing and low-latency query responses. Achieving both goals is highly challenging when the workloads are co-run on conventional multicore CPUs, because they differ in workload characteristics and quality-of-service requirements. The rise of general-purpose graphics processing units (GPUs) offers an opportunity to decouple the two, maximizing query execution performance while minimizing interference with transaction processing. However, making the best use of such a device in database systems requires effectively addressing several mismatches between the parallel-computing-oriented architecture of GPUs and data-processing-oriented database query workloads. This project seeks to build a GPU-based query execution engine that gives in-memory database systems an unprecedented opportunity and capability to execute both transaction workloads and analytics workloads in a high-performance and low-cost way. The project aims for high broader impact by transforming basic research results into practical database systems, by training both undergraduate and graduate students through research activities, and by promptly introducing new research results into classrooms.

Specifically, the project will address three mismatches: (1) between the high parallel computing power of GPUs and the slow data transfer speeds to and from main memory; (2) between the GPU's limited programming capabilities and the complex structures of database queries; and (3) between the availability of massively parallel processing cores and the lack of system software support for concurrent task execution and memory management. The project will apply a holistic system design methodology to address these challenges cohesively through several closely related research tasks. (1) The project team will develop a software translation framework with multi-level abstractions and optimizations to automatically translate Structured Query Language (SQL) queries into highly efficient programs running on GPUs. (2) The team will develop query execution cost models that account for GPU performance factors to support query optimization and dynamic re-optimization. (3) The team will design specific algorithms and query execution techniques to efficiently handle two complex but practically important classes of queries, namely nested sub-queries and recursive queries. (4) The team will build a software layer for system-level device memory management and dynamic query scheduling. (5) A final task will be to integrate the project outcomes into representative open-source in-memory systems with comprehensive workloads. The research results of this project will be released as open source.
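
To make the first mismatch and the cost-model task more concrete, the following is a minimal illustrative sketch, not the project's actual model. It assumes a table scan whose cost is simply transfer time plus compute time. The names DeviceProfile, estimated_seconds, and choose_device, as well as all throughput figures, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class DeviceProfile:
    """Hypothetical hardware parameters; the values used below are illustrative, not measured."""
    scan_rate_gbps: float      # assumed sustained scan throughput, in GB/s
    transfer_rate_gbps: float  # assumed host-to-device bandwidth, in GB/s (0 means no transfer)


def estimated_seconds(data_gb: float, device: DeviceProfile, resident: bool) -> float:
    """Estimate scan time as transfer time (if the data is not resident) plus compute time."""
    transfer = 0.0
    if not resident and device.transfer_rate_gbps > 0:
        transfer = data_gb / device.transfer_rate_gbps
    compute = data_gb / device.scan_rate_gbps
    return transfer + compute


def choose_device(data_gb: float, cpu: DeviceProfile, gpu: DeviceProfile,
                  gpu_resident: bool) -> str:
    """Pick the processor with the lower estimated time; CPU data is always in main memory."""
    cpu_time = estimated_seconds(data_gb, cpu, resident=True)
    gpu_time = estimated_seconds(data_gb, gpu, resident=gpu_resident)
    return "gpu" if gpu_time < cpu_time else "cpu"


# Assumed example profiles: a CPU scanning at 20 GB/s, and a GPU scanning
# at 200 GB/s behind a 12 GB/s host-to-device link.
cpu = DeviceProfile(scan_rate_gbps=20.0, transfer_rate_gbps=0.0)
gpu = DeviceProfile(scan_rate_gbps=200.0, transfer_rate_gbps=12.0)

print(choose_device(4.0, cpu, gpu, gpu_resident=False))
print(choose_device(4.0, cpu, gpu, gpu_resident=True))
```

Under these assumed numbers, the transfer cost dominates when the data must first be copied to the device, so the sketch prefers the CPU. Once the data is already resident in device memory, it prefers the GPU. A realistic cost model would also need to account for factors such as operator type, selectivity, and concurrent queries.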

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1718450
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2017-09-01
Budget End: 2021-08-31
Support Year:
Fiscal Year: 2017
Total Cost: $500,000
Indirect Cost:
Name: Ohio State University
Department:
Type:
DUNS #:
City: Columbus
State: OH
Country: United States
Zip Code: 43210