The goal of this research project is to help people make sense of large graphs, ranging from social networks to network traffic. The approach combines two complementary fields that have historically had little interaction -- data mining and human-computer interaction -- to develop interactive algorithms and interfaces that help users gain insights from graphs with hundreds of thousands of nodes and edges. Specifically, the project will develop mixed-initiative machine learning, visualization, and interaction techniques in which computers do what they are best at (sifting through huge volumes of data and spotting outliers) while humans do what they are best at (recognizing patterns, testing hypotheses, and inducing schemas). This research addresses two classes of tasks. The first is attention routing: using machine learning to direct an analyst's attention to interesting nodes or subgraphs that do not conform to normal behavior. The second is sensemaking: helping analysts build in-depth representations and mental models of specific areas or aspects of a graph. The tools will be evaluated through both controlled laboratory studies and long-term field deployments.
Because large graphs appear in many settings -- national security, intrusion detection, business intelligence (recommendation systems, fraud detection), biology (gene regulation), and academia (scientific literature) -- the potential benefits of new tools for making sense of graphs are far-reaching. Project results, including open-source software and annotated data sets, will be disseminated via the project web site (http://kittur.org/large_graphs.html) and incorporated into educational activities.