This project will investigate a new way to collect network data that would make answering social scientific questions involving social networks faster and more cost-effective. Networks are widely recognized across multiple disciplines as critically important for understanding how information spreads (communications), how illnesses spread (public health), how legislation is passed (political science), how norms and customs form (sociology), how companies and governments coordinate economic activities (economics & geography), how people form and maintain relationships (developmental psychology), and how threats to national security can be disrupted (intelligence). Collecting data on these types of networks is often costly and impractical, which has limited progress in answering these questions. Rather than attempt to measure directly the interactions between people (or companies, or cells, etc.), the new methods developed in this project will measure interactions indirectly by looking at patterns of similar behavior. For example, it would be impractical to ask everyone in a large city who they talk with, but we may be able to infer that two people talk with one another if they routinely attend the same events. This project will compare several different methods for making these inferences to determine which method (if any) is the most accurate. Identifying accurate methods for indirectly measuring networks is important because it will reduce the time and economic cost of answering research questions, such as those noted above, that have scientific and societal benefits.

The inferential methods investigated in this project are known as methods of backbone extraction. Given a bipartite projection network in which two agents (e.g., people) share some artifacts in common (e.g., they attend the same events), these methods determine whether they share enough artifacts in common to warrant the inference that they are linked (e.g., they interact with each other). Methods for making such inferences differ primarily by whether they control for nothing (e.g., an unconditional threshold), for the number of artifacts each agent has (e.g., a hypergeometric threshold), or for both the number of artifacts each agent has and the number of agents each artifact has (e.g., the fixed and stochastic degree sequence models). After implementation of several existing methods in a publicly available package for the R software and evaluation of their computational complexity, their ability to accurately recover known (i.e., ground truth) networks with varying structural properties from synthetic bipartite data that contain varying amounts of noise will be compared. In addition, the performance of these methods using two publicly available benchmark empirical bipartite datasets that have known ground truths will be tested.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency
National Science Foundation (NSF)
Institute
Division of Social and Economic Sciences (SES)
Type
Standard Grant (Standard)
Application #
1851625
Program Officer
Joseph Whitmeyer
Project Start
Project End
Budget Start
2019-05-15
Budget End
2021-04-30
Support Year
Fiscal Year
2018
Total Cost
$119,904
Indirect Cost
Name
Michigan State University
Department
Type
DUNS #
City
East Lansing
State
MI
Country
United States
Zip Code
48824