Graphs are powerful tools for representing real-world networked data in a wide range of scientific and engineering domains. For example, graphs are used to represent people and their interactions in social networks, proteins and their functionality in biological networks, and landmarks and roads in transportation networks. Understanding graph properties and deriving hidden information by performing analytics on graphs at extreme scale is critical to scientific progress across multiple domains and to solving impactful real-world problems. Cloud platforms have been adopted to perform extreme-scale graph analytics. This has led to an exponential increase in workloads, while at the same time the rate of performance improvement of cloud platforms has slowed. To address this, cloud platforms are being augmented with accelerators. However, the expertise required to realize high performance on such accelerator-enhanced cloud platforms will limit their accessibility to the broader scientific and engineering community. To address this issue, this project will research and develop a toolkit that provides Graph Analytics as a Service, enabling researchers to easily perform extreme-scale graph analytics workflows on accelerator-enhanced cloud platforms. This will significantly increase researchers' productivity because i) researchers will avoid the steep learning curve of developing parallel implementations of graph analytics algorithms, and ii) the increased size and scale of graph analytics will allow researchers to analyze significantly larger datasets at reduced latency, thereby enriching the quality of domain research. Moreover, the techniques developed in this project will also be applicable to streaming graph analytics at the "edge" for applications such as autonomous vehicles and smart infrastructure.
The toolkit is expected to be used in many engineering and science disciplines, including power systems engineering, network biology, preventive healthcare, and smart infrastructure. The research conducted in this project will also yield material appropriate for inclusion in graduate and undergraduate courses.

The project will research and develop high-performance graph analytics algorithms and software for key graph workflows and kernels spanning multiple scientific and engineering domains. The target platforms will be accelerator-enhanced cloud platforms built from emerging node architectures comprising multi-core processors, Field Programmable Gate Arrays (FPGAs), and high bandwidth memory (HBM) with a cache-coherent interface. An integrated optimization framework consisting of memory optimizations and partitioning and mapping techniques will be developed to exploit the heterogeneity of the target platforms. Specifically, techniques for optimal memory data layout and integrated optimizations for cloud execution will be developed to realize scalable performance on accelerator-enhanced cloud platforms. The memory data layout optimization seeks to fully exploit the high bandwidth provided by HBM by ensuring data reuse for a broad class of graph analytics problems. The proposed software will ensure seamless parallel processing of the entire graph on a single heterogeneous node as well as on cloud platforms with multiple heterogeneous nodes. The integrated optimization framework will be developed into a scalable, deployable, robust Cyber Infrastructure (CI) toolkit to provide Graph Analytics as a Service (GAaaS). The framework will be developed using state-of-the-art heterogeneous platforms. By accelerating graph analytics workflows on cloud platforms, this project will enable researchers to perform extremely large-scale graph analytics workflows, which are key components of many scientific and engineering domains.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 1911229
Program Officer: Seung-Jong Park
Budget Start: 2019-06-01
Budget End: 2022-05-31
Fiscal Year: 2019
Total Cost: $481,837

Name: University of Southern California
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089