Over the past two decades, massive networks or graphs appear in the very fabric of society. For example, networks appear in the form of social networks such as people on Facebook, sharing networks such as Twitter, computer networks such as the routers of the Internet, and power systems. Understanding the structure of these real-world networks is a fundamental scientific challenge. A significant aspect to this structure is the existence of "communities", or tightly knit collections of objects in the network with a large number of connections within them; examples iwould include a large group of mutual friends in a social networks, or an echo-chamber in a sharing network. A common technique to analyze community structure, as well as other structural features, is the use of random walks. In this technique, one imagines a particle that randomly walks in the graph by simply moving from object to object by following a random connection at each step. Despite the simplicity of this technique, it forms the foundation for state-of-the-art community-detection and graph-sampling methods; however, although there is a rich and deep mathematical theory on random walks, there is a lack of understanding for the success of this technique. In particular, the current theory on random walks essentially uses measures of "bottlenecks" (called conductance), and shows that random walks are effective when there are no bottlenecks. But there is overwhelming empirical evidence that real-world graphs contain bottlenecks, yet random walks are effective for analyzing them. The main aim of this research is a scientific investigation of this phenomenon, with the hope of finding the right mathematical tools to explain this behavior. Given the central role that massive networks play in modern society, such studies play a fundamental role in scientific research.

It has been recognized in earlier work that the classic notion of conductance is too crude a lens to understand real-world graphs. The aim of this research is to design richer conductance measures to study the behavior of random walks, design provably robust algorithms to approximate these measures, and demonstrate the relevance of these measures for algorithmic problems in graph sampling. The starting point for the investigation is a "truncated" notion of conductance that ignores small sets, introduced in the discrete math literature to study volumes of convex bodies. The investigators believe this to be a more useful characterization of random walks on real-world graphs. This leads to a number of research challenges. The first challenge is to design efficient algorithms that approximate these richer conductance measures. The second challenge is to prove that existing empirical heuristics are exploiting these other conductance measures, to get performance better than that predicted by previous theory. The third challenge is to perform a detailed study of these measures on real-world graphs in order to empirically ground the theory. One of the by-products of this research will be a greater insight into the actual structure of real-world graphs, and this will likely inspire better models. The primary outcomes from this research will be in the form of theorems and algorithms, as well as papers describing them, that characterize the impact of richer conductance measures on the behavior of algorithms run on networks. The investigators also plan to release software to compute or approximate the new conductance measures proposed.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Start
Project End
Budget Start
2019-10-01
Budget End
2022-09-30
Support Year
Fiscal Year
2019
Total Cost
$250,000
Indirect Cost
Name
Purdue University
Department
Type
DUNS #
City
West Lafayette
State
IN
Country
United States
Zip Code
47907