The burgeoning field of "Social Network Analysis" focuses on extracting useful insights from social network data. Implemented or envisioned applications range from learning about the nature and driving forces behind human interactions, to targeted product or activity recommendations, and even homeland security. In contrast to other networks, such as transportation or computer networks, social network data are practically always subject to massive uncertainty and noise: data pertaining to individuals are often not observable, or are observed incorrectly. The primary goal of this project is to understand the risks and implications of such noisy data, and to design network analysis algorithms that are significantly more robust to noise and missing data. Given the important role that mathematical models play in social network analysis, a closely related thread of the project is to analyze the fit between typical social network models and real-world data, in particular regarding high-level connectivity properties. The project website will be used to disseminate research prototypes and data that are collected as part of the project.

Specifically, three connected research thrusts that integrate the PIs' expertise in machine learning and theoretical computer science will be explored: (1) How well do standard random graph models fit real-world social network data, in particular with regard to expansion and spectral properties? Since the answer is likely "poorly," how well do modifications based on requiring local or global structure remedy this problem? (2) What is the impact of missing observations of diffusion or activation processes on the inferred social networks when learning from some contagious behavior? How can this impact be mitigated by algorithms that take the possibility of missing data into account? (3) If social network data are observed with significant (and possibly non-random) noise, under what conditions can stability of an algorithmic output be ensured? How "obvious" does the right answer have to be so that it is not obscured by noise in the data? Can "obvious" answers be found more efficiently? The proposed research has the potential to impact the way in which social network inference and optimization are addressed. The PIs are committed to a suite of broader-impact activities, among them the inclusion of undergraduate students in the proposed research and outreach to local high school students.
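
The stability question in thrust (3) can be made concrete with a small experiment. The following is only an illustrative sketch, not the PIs' method: it assumes a deliberately simple "algorithm" (return the highest-degree node) and a uniform random edge-flip noise model, and all function names (`perturb_edges`, `top_degree_node`, `stability`) are hypothetical. It compares an instance where the answer is "obvious" (a hub connected to everyone) with one where the leading node is ahead by a single edge.

```python
import random

def perturb_edges(edges, nodes, flip_prob, rng):
    """Noisy copy of an undirected edge set: every node pair is flipped
    (edge added or removed) independently with probability flip_prob.
    This uniform noise model is an assumption made for illustration."""
    noisy = set()
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            present = (u, v) in edges
            if rng.random() < flip_prob:
                present = not present
            if present:
                noisy.add((u, v))
    return noisy

def top_degree_node(edges, nodes):
    """Stand-in 'algorithm': the node with the largest degree.
    Ties are broken in favor of the smallest node id."""
    degree = {u: 0 for u in nodes}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return max(nodes, key=lambda u: (degree[u], -u))

def stability(edges, nodes, flip_prob, trials=200, seed=0):
    """Fraction of noisy copies on which the output matches the clean output."""
    rng = random.Random(seed)
    clean = top_degree_node(edges, nodes)
    hits = 0
    for _ in range(trials):
        noisy = perturb_edges(edges, nodes, flip_prob, rng)
        if top_degree_node(noisy, nodes) == clean:
            hits += 1
    return hits / trials

n = 30
nodes = list(range(n))

# "Obvious" instance: node 0 is a hub adjacent to every other node.
star = {(0, v) for v in nodes[1:]}

# "Subtle" instance: a ring plus two chords at node 0, so node 0
# (degree 4) leads the runners-up (degree 3) by a single edge.
ring = {(min(u, (u + 1) % n), max(u, (u + 1) % n)) for u in nodes}
subtle = ring | {(0, 10), (0, 15)}

star_stability = stability(star, nodes, flip_prob=0.05)
subtle_stability = stability(subtle, nodes, flip_prob=0.05)
print("hub instance:   ", star_stability)
print("subtle instance:", subtle_stability)
```

Under these assumptions, the hub's large degree margin should survive the noise far more often than the one-edge margin in the ring instance, which is the intuition behind asking how "obvious" an answer must be. Real social network noise is likely non-uniform and non-random, so this sketch says nothing about the harder settings the project targets.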

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1619458
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2016-09-01
Budget End: 2020-08-31
Support Year:
Fiscal Year: 2016
Total Cost: $507,996
Indirect Cost:
Name: University of Southern California
Department:
Type:
DUNS #:
City: Los Angeles
State: CA
Country: United States
Zip Code: 90089