The investigator and his colleagues study the large-scale structure of real-world networks, such as social, biological, and information networks. Several specific projects will be undertaken. The first will focus on the classic problem of community detection in networks and the development of compact objective functions for assessing community quality and effective strategies for their optimization. The methods proposed improve on previous ones in being rigorously motivated, utilizing as much as possible of the information stored in a network, and avoiding some of the pitfalls of previous approaches, such as resolution limits. A second project will develop methods for general structural inference in networks based on a combination of expectation--maximization (EM) algorithms with Monte Carlo and belief propagation methods. A particular strength of these approaches is that, because they optimize marginal probabilities, they have the potential to avoid overfitting and offer strategies for solving model selection problems. Several further projects focus on the extension of these techniques to more complex forms of network structure, including overlapping vertex classes, hierarchical structure, and dependence on general classes of hidden variables.
Many systems of scientific interest can be represented as networks, from the Internet to biological networks to social networks. In the last decade copious amounts of network data have become available in different fields, but in many cases these data are dense and bewildering, presenting a substantial challenge to the user trying to make sense of their structure. The fundamental question addressed in this research is how we can gain new understanding about physical, biological, and social systems from their network structure. For instance, the investigator and his collaborators will develop methods for detecting groups of closely interacting nodes in networks, such as cliques of individuals in social networks, or groups of related web sites on the web. They will also tackle general questions of using network structure to deduce unknown properties of network nodes. If, for instance, there are several different types of nodes in a network, such as classes of chemicals in a metabolic network in the cell, how can we identify the types from network structure alone, without measuring them directly (which may be difficult or costly)? The funded research will create a range of mathematical tools that can be applied to problems in this area. The methods to be developed have a number of possible applications, including new techniques to aid researchers working on biological and social networks, commercial applications in online social networks, and software applications in data visualization and automatic data summarization, among others.