The long-term research goal is to develop novel data mining technologies to elucidate the structures and dynamics of complex but ubiquitous networks. A complex network is a large system of elements (vertices) that are joined by non-trivial relationships (edges). Examples of such complex networks include the WWW, metabolic and protein networks, social networks, and economic and financial markets. The underlying principles and laws of these network systems can help us construct more effective communication mechanisms, find cures for fatal diseases, and deal with economic crises.

Despite significant advances toward understanding the fundamental laws that govern the structure and behavior of complex networks, there is still a disconnect between current analytical techniques and their applicability to real-world complex networks. A principled approach for systematically analyzing a single large complex network and linking system behaviors to network structure is lacking. There is also an immediate and crucial need for a theoretical framework to understand the relationships between multiple networks, which is key to comparative network analysis. How to integrate and leverage the rich system data associated with network topology, such as measurement time series, to study complex systems is still an open question. This project addresses these issues by: 1) developing novel graph-theoretical and information-theoretical approaches to extracting network backbones, which both simplify and highlight network structures; 2) developing information-theoretical network distance measures and clustering algorithms for comparative network analysis; 3) applying causality inference and network modularity to integrate time series with network topology. The proposed mining methodologies build upon an innovative blend of graph-theoretical, information-theoretical, and statistical learning concepts and techniques, and can greatly expand the reach of data mining. This project will also help us better analyze the emerging complexity, heterogeneity, and large scale of real-world complex network data.

In close collaboration with domain experts from the social and political sciences, software engineering, and bioinformatics, the proposed techniques have the potential to help understand how human society is organized at the individual level (social networks) and the organizational level (political science); illuminate how large-scale software systems form and evolve; reveal the organizational principles of biocellular systems in a dynamic environment; and identify therapeutic or drug targets. Using popular online social networks, such as MySpace and Facebook, as "hooks", this project will attract and recruit students from underrepresented groups, including women and minorities, to computer science, prepare them for the field, and involve them in cutting-edge research.

For further information see the project web page at: www.cs.kent.edu/~jin/NSFCAREER/

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 0953950
Program Officer: Vasant G. Honavar
Project Start:
Project End:
Budget Start: 2010-04-01
Budget End: 2015-03-31
Support Year:
Fiscal Year: 2009
Total Cost: $438,539
Indirect Cost:
Name: Kent State University
Department:
Type:
DUNS #:
City: Kent
State: OH
Country: United States
Zip Code: 44242