Network and graph data appear in diverse applications such as social networks, biological networks, and mobile ad-hoc networks. In many cases the available graph data are inherently uncertain, owing either to the data collection process or to the preprocessing of the data. This uncertainty and imprecision become a critical impediment to understanding and effectively using the information contained in such graphs. In this project, we address the problem of managing and mining such uncertain graphs. To do so, we adopt possible-world semantics for probabilistic and uncertain graphs, and within this framework we study algorithms for well-established data-mining problems. Motivated by real-life applications, we focus on specific data-analysis tasks, but we keep the framework general enough to be used for a wide set of tasks and applications. In particular, we develop scalable and efficient algorithms for a number of important tasks, including nearest-neighbor retrieval, clustering and partitioning, finding important nodes and edges, and summarizing large uncertain graphs.
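To make possible-world semantics concrete, the following is a minimal Python sketch and not the project's own implementation. It assumes the common model in which each edge exists independently with a given probability, so that every subset of edges is one possible world. The toy graph, its probabilities, and the function names are all hypothetical. The sketch computes the probability that one node can reach another, both exactly (by enumerating every world) and approximately (by sampling worlds).

```python
import itertools
import random
from collections import deque

# Hypothetical toy uncertain graph: undirected edge -> existence probability.
# Assumption: edges exist independently of one another.
EDGES = {
    ("a", "b"): 0.9,
    ("b", "c"): 0.5,
    ("a", "c"): 0.2,
}


def connected(world_edges, source, target):
    """Breadth-first search over the edges present in one possible world."""
    adjacency = {}
    for u, v in world_edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return False


def exact_reachability(edges, source, target):
    """Sum the probabilities of all possible worlds where target is reachable.

    Enumerates 2**len(edges) worlds, so it is only feasible for tiny graphs.
    """
    edge_list = list(edges)
    total = 0.0
    for present in itertools.product([False, True], repeat=len(edge_list)):
        world_probability = 1.0
        world = []
        for edge, is_present in zip(edge_list, present):
            p = edges[edge]
            world_probability *= p if is_present else 1.0 - p
            if is_present:
                world.append(edge)
        if connected(world, source, target):
            total += world_probability
    return total


def sampled_reachability(edges, source, target, samples=10000, seed=0):
    """Monte Carlo estimate: draw random worlds and count those that connect."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        world = [edge for edge, p in edges.items() if rng.random() < p]
        if connected(world, source, target):
            hits += 1
    return hits / samples


if __name__ == "__main__":
    print(exact_reachability(EDGES, "a", "c"))
    print(sampled_reachability(EDGES, "a", "c"))
```

Exact enumeration grows exponentially with the number of edges, which is one reason sampling-based estimates are a natural starting point on larger uncertain graphs. The sampling version trades exactness for a running time that is linear in the number of samples.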

We expect the results of this project to have an impact on several application domains. For example, they can help internet-based and social-media companies analyze their data and improve their targeted-advertising policies and practices, which in turn can make these companies more viable and strengthen the economy of internet-based business. In medicine, biology, and biochemistry, networks play a very important role, and many of these networks are better modeled as uncertain networks. Our project can help analyze these networks and lead to new biological insights. The results of this project are disseminated as follows: (1) we develop publicly available prototypes; (2) we include the results of our work in our classes and lectures; (3) we communicate our results to scientists in computer science and other fields, and to our industry collaborators, through publications and demos. We also actively try to engage graduate and undergraduate students, including women and minorities, in our research.

For further information see the web site at: www.cs.bu.edu/~gkollios/ugraphs/

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1320542
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2013-09-01
Budget End: 2017-08-31
Support Year:
Fiscal Year: 2013
Total Cost: $500,000
Indirect Cost:
Name: Boston University
Department:
Type:
DUNS #:
City: Boston
State: MA
Country: United States
Zip Code: 02215