Technological innovations have been a primary force in the advancement of scientific research and in social progress. Large-scale network and relational data are frequently encountered in genomics and health sciences, economics, finance, and social media. The proposed project will (1) enhance methodological and theoretical developments for the statistical analysis of network and relational data and (2) advance the understanding of the community structure of social networks. The research emanating from this grant will advance the frontiers of theory and methods for network data modeling. The new developments will give researchers from diverse fields of the sciences and humanities a better understanding of large-scale network data, e.g., of the social behavior of individuals and the dynamic nature of social networks.

The proposed project has three interrelated objectives under the theme of statistical inference for large-scale network and relational data. (1) To introduce a new framework for community detection with covariate information. Many approaches to community detection already exist. However, a majority of them analyze the network without considering covariate information, which could be valuable for achieving greater accuracy of community detection. The goal of this research is to study when and how covariate information improves community detection accuracy. (2) To develop a new dynamic stochastic block model framework with applications in change point detection. The stochastic block model and its variants are usually defined for a static network. The goal here is to define a dynamic version of the stochastic block model, with a clear interpretation of how the network evolves over time. A general dynamic spectral clustering method will be proposed and its theoretical properties established. The important problem of change point detection in the dynamic network will be studied in detail. (3) To introduce a conditional dependency measure with applications in undirected graphical models. It is of fundamental interest to ascertain the variables or factors underlying the network dependency structure. The goal is to introduce a flexible conditional dependency measure that can capture a wide range of dependency structures. The PI will develop a new method for generating a general undirected graph with desirable features by making use of the resulting conditional dependency measure.
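As background for objectives (1) and (2), the following is a minimal sketch of the standard static setting that the project proposes to extend: a two-community stochastic block model and a basic spectral clustering step. It is not the PI's method and uses neither covariates nor time dynamics. The function names (`simulate_sbm`, `spectral_two_communities`, `agreement`) and the parameter values (200 nodes, edge probabilities 0.5 within and 0.05 between communities) are assumptions chosen only for illustration.

```python
import numpy as np


def simulate_sbm(labels, p_in, p_out, rng):
    """Draw a symmetric 0/1 adjacency matrix from a stochastic block model.

    Nodes with the same label are linked with probability p_in,
    nodes with different labels with probability p_out.
    """
    n = len(labels)
    same = labels[:, None] == labels[None, :]
    probs = np.where(same, p_in, p_out)
    # Sample each pair once (strict upper triangle), then symmetrize.
    upper = np.triu(rng.random((n, n)) < probs, k=1)
    return (upper | upper.T).astype(float)


def spectral_two_communities(adj):
    """Split nodes into two groups by the sign of the eigenvector
    associated with the second-largest eigenvalue of the adjacency matrix."""
    _, eigvecs = np.linalg.eigh(adj)  # eigenvalues come back in ascending order
    second = eigvecs[:, -2]
    return (second > 0).astype(int)


def agreement(truth, estimate):
    """Fraction of correctly grouped nodes, up to swapping the two labels."""
    acc = np.mean(truth == estimate)
    return max(acc, 1.0 - acc)


rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 100)  # two communities of 100 nodes each
adj = simulate_sbm(labels, p_in=0.5, p_out=0.05, rng=rng)
est = spectral_two_communities(adj)
print("agreement with true communities:", agreement(labels, est))
```

In this well-separated setting the sign of the second eigenvector should recover the communities almost perfectly. The questions raised in the abstract concern harder regimes, for example when the network alone is too sparse or noisy and covariates may help, or when the block structure changes over time.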

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Application #: 2013789
Program Officer: Gabor Szekely
Project Start: (not listed)
Project End: (not listed)
Budget Start: 2019-11-01
Budget End: 2022-06-30
Support Year: (not listed)
Fiscal Year: 2020
Total Cost: $151,307
Indirect Cost: (not listed)
Name: New York University
Department: (not listed)
Type: (not listed)
DUNS #: (not listed)
City: New York
State: NY
Country: United States
Zip Code: 10012