Scientific data increasingly takes the form of networks, but the ability to collect and process graph data has outpaced our ability to analyze it statistically. Most of the critical scientific questions about networks revolve around comparisons between networks (e.g., over time or across experimental conditions). Such problems arise in fields as different as neuroscience, epidemiology, economics, climatology and criminology. Currently, network analysts compare only basic descriptive statistics (e.g., the average distance between nodes in the graph), ignoring issues of global structure and statistical validity. We will develop a rigorous statistical theory of network comparisons. Our approach rests on recent developments in network theory which show how large graphs approximate continuous geometric objects, so that tools for geometric comparisons can be applied to networks.

Our project will develop rigorous statistical methods and efficient algorithms for network comparisons. The first step is the flexible non-parametric estimation of continuous network models, for which we will pursue three complementary strategies: regression smoothing, density estimation in non-Euclidean latent spaces, and ensembles of trees. Having represented networks as continuous stochastic processes, we will develop statistical theory and methods for detecting and characterizing differences between such processes. Interdisciplinary proof-of-concept applications, including those in public health (through online social networks), finance (through financial networks), neuroscience (through brain connectivity networks), genetics (through gene regulatory networks), and proteomics (through protein interaction networks), will demonstrate the power of the geometric approach in comparing large and disparate sample data.
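
As a rough illustration of the two-stage idea described above (estimate a continuous model for each network, then compare the estimates), the following sketch uses a degree-sorted block average, a very crude relative of regression smoothing. It is not the project's method. The function names (histogram_estimate, l2_distance, sample_graph), the choice of four bins, the distance measure, and the simulated random graphs are all assumptions made for this example.

```python
import numpy as np


def histogram_estimate(adj, n_bins):
    """Block-average estimate of edge probabilities on an n_bins x n_bins grid.

    Nodes are sorted by degree, split into roughly equal bins, and the edge
    density of each pair of bins is recorded. Illustrative only.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    # Sort nodes by descending degree so comparable nodes land in the same bin.
    order = np.argsort(-adj.sum(axis=1), kind="stable")
    sorted_adj = adj[np.ix_(order, order)]
    bins = np.array_split(np.arange(n), n_bins)
    est = np.zeros((n_bins, n_bins))
    for a, rows in enumerate(bins):
        for b, cols in enumerate(bins):
            block = sorted_adj[np.ix_(rows, cols)]
            if a == b:
                # Within a bin, exclude the diagonal (no self-loops).
                m = len(rows)
                denom = m * (m - 1)
                est[a, b] = block.sum() / denom if denom > 0 else 0.0
            else:
                est[a, b] = block.mean()
    return est


def l2_distance(est_a, est_b):
    """Root-mean-square difference between two estimates on the same grid."""
    return float(np.sqrt(np.mean((est_a - est_b) ** 2)))


def sample_graph(n, p, rng):
    """Simulate an undirected random graph with independent edge probability p."""
    upper = np.triu(rng.random((n, n)) < p, k=1)
    return (upper | upper.T).astype(int)


rng = np.random.default_rng(0)
g_sparse_1 = sample_graph(200, 0.1, rng)
g_sparse_2 = sample_graph(200, 0.1, rng)
g_dense = sample_graph(200, 0.4, rng)

est_sparse_1 = histogram_estimate(g_sparse_1, 4)
est_sparse_2 = histogram_estimate(g_sparse_2, 4)
est_dense = histogram_estimate(g_dense, 4)

# Two graphs drawn from the same model versus two drawn from different models.
same = l2_distance(est_sparse_1, est_sparse_2)
diff = l2_distance(est_sparse_1, est_dense)
print(f"same model: {same:.3f}, different models: {diff:.3f}")
```

In this toy setting the distance between the two graphs drawn from the same model should be much smaller than the distance between graphs drawn from different models. Turning such a gap into a statistically valid test is the kind of question the project addresses, and the sketch does not attempt it.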

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1418124
Program Officer: Yong Zeng
Project Start:
Project End:
Budget Start: 2014-09-15
Budget End: 2017-08-31
Support Year:
Fiscal Year: 2014
Total Cost: $261,799
Indirect Cost:

Name: Carnegie-Mellon University
Department:
Type:
DUNS #:
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213