Networks are ubiquitous in the modern world, with social networks and the World Wide Web as well-known examples. Understanding the structure of networks is critical to understanding real-world phenomena, making predictions, and helping inform decisions on, for example, strategies to disrupt and dismantle terrorist networks, curb the spread of infectious diseases, and reduce systematic risk in financial markets. To help understand and predict such phenomena in the face of uncertainty, professionals need statistical models which are both complex and scalable (i.e., models which are capable of modeling a wide range of network characteristics and which can be applied to large networks). Existing models are either scalable but simplistic or complex but not scalable. This research project will develop the first generation of models which are both complex and scalable. The developed models and methods will have applications in a wide range of areas, including national security (e.g., insurgencies, terrorism), public health (e.g., the spread of infectious diseases), and finance (e.g., systematic risk in financial markets).

This research project will develop the next generation of random graph models which are both complex and scalable by merging the two most important streams of statistical network analysis, stochastic block models and exponential-family random graph models, with a view to reducing the disadvantages of each while retaining the advantages of both. Stochastic block models are scalable but simplistic, whereas exponential-family random graph models are complex but not scalable. The next-generation models studied here bridge the gap between complexity and scalability and are both complex and scalable. In addition to elaborating next-generation models, this research will address the unique computational and theoretical challenges raised by next-generation models. The computational challenge of estimating next-generation models will be addressed by taking advantage of model structure, including local dependence and local convexity properties, and by exploiting massive-scale minorization-maximization methods which break down the high-dimensional optimization problem into low-dimensional ones that can solved in parallel. The theoretical challenge of studying the properties of estimators will be addressed by exploiting novel concentration of measure inequalities. The concentration of measure inequalities will take into account the dependence inherent in networks as well as the lack of smoothness of estimators.

Agency
National Science Foundation (NSF)
Institute
Division of Mathematical Sciences (DMS)
Application #
1513644
Program Officer
Gabor Szekely
Project Start
Project End
Budget Start
2015-08-01
Budget End
2019-01-31
Support Year
Fiscal Year
2015
Total Cost
$200,001
Indirect Cost
Name
Rice University
Department
Type
DUNS #
City
Houston
State
TX
Country
United States
Zip Code
77005