The last decade has seen the growth of extremely large, unstructured, and dynamic data sets, loosely termed Big Data. However, there is a growing desire to extract not just specific properties of collections of such facts, but also relationships between the underlying entities in that data. Examples come from a broad swatch of modern life: bioinformatics, financial, recommendation systems, cyber and national security, and social networks. Graphs have emerged as a valuable and productive paradigm for expressing such problems, where a graph is a collection of a set of objects (vertices) where some pairs of objects are connected by links (edges) that represent some relation between the two. In the last decade there has been an explosion in support for graphs, with widely differing execution models and targeted applicability. Although numerous graph benchmarks have been proposed, only one has had a rigorous accumulation of performance data from multiple platforms (www.graph500.org). Computation is over a whole static graph, whereas the real world sees applications where update data is streaming into large persistent graphs, and very many small targeted queries may be in progress at once. Given the expected productivity increase of using a graph programming paradigm over conventional programming, especially for parallel systems, it is of growing importance to have common mini-apps that can be used for cross-paradigm comparisons. Also, given the continued increase in graph sizes, it is important to understand how the underlying graph engines scale both in the size and type of the target graphs and in the amount and mix of parallelism and concurrency they can support. This project addresses this need. In collaboration with commercial and government research labs, the primary objective is on defining a set of mini-apps that reflect complex real-world applications more sophisticated than today's simple benchmarks, converting these mini-apps to the existing major graph packages, and then running them on a wide range of parallel systems. The wider impact can be significant. Identification of relevant mini-apps and how they perform across different systems will provide insight into both how to write more complete graph applications in more scalable ways, and which aspects of which programming systems and platforms are best suited. It is also expected that not all mini-apps will be expressible in all the current paradigms, providing insight to the developers of those paradigms on expressibility issues. Given the relative infancy of such graph packages such insight now can radically improve their applicability to real applications in the future.

Agency
National Science Foundation (NSF)
Institute
Division of Computer and Communication Foundations (CCF)
Type
Standard Grant (Standard)
Application #
1642280
Program Officer
Almadena Chtchelkanova
Project Start
Project End
Budget Start
2016-08-01
Budget End
2019-07-31
Support Year
Fiscal Year
2016
Total Cost
$307,369
Indirect Cost
Name
University of Notre Dame
Department
Type
DUNS #
City
Notre Dame
State
IN
Country
United States
Zip Code
46556