The investigators will develop a unified nonparametric theoretical framework for studying stochastic network models and will design scalable algorithms (and software) to fit these models. They intend to develop and validate their methods with collaborators in biology who have gathered extensive new data for assessing protein structure and determining biological pathways, particularly in Drosophila. Such problems are ubiquitous in genomics, and the investigators expect their methods to carry over widely. They will also use networks with uncertainty measures to study relationships between words and phrases in a newspaper database, in order to provide media analysts with automatic and scalable algorithms. In all cases they will focus on methods that provide general measures of statistical confidence, which have been lacking in work so far.

Our world is connected through relationships among "actors," who can be people, organizations, words, genes, proteins, and more. Advances in information technology have enabled the collection of massive amounts of data in all disciplines, from which relationships between these actors can be built. These relationships can be effectively described as networks, and properties or patterns in these networks may arise by chance or may represent knowledge. In response to this recent data availability and its huge potential for knowledge discovery, networks are attracting much attention from researchers in physics, social science, computer science, and probability. While contributing to the development of core statistical research, the proposed research will directly impact the interdisciplinary field of network analysis and the study of complex networks. The applications of the research results are diverse and extend well beyond the two fields studied in the proposal, genomics and media analysis. They include national security, communications, sociology, political science, and infectious disease. The statistical tools developed are unifying and could change how many scientists approach network analysis. As a result, statistical research will become more prominent in the networks community.

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 1160319
Program Officer: Gabor Szekely
Budget Start: 2012-06-01
Budget End: 2017-05-31
Fiscal Year: 2011
Total Cost: $1,199,916
Name: University of California Berkeley
City: Berkeley
State: CA
Country: United States
Zip Code: 94710