Modern direct manipulation and visualization systems have made key strides in bringing powerful data transformations and algorithms to the analyst's desktop. But to further promote the vision of powerful visual analytics, wherein automated algorithms and visual representations complement each other to yield new insight, we must continually increase the expressiveness with which analysts interact with data. This project focuses on the task of storytelling, that is, stringing together seemingly unconnected pieces of data into a coherent thread or argument. To support storytelling, which requires both human judgment and algorithmic assistance, the PIs will first develop a new theory of relational redescriptions that provides a uniform way to describe data and to compose data transformation algorithms across a multitude of domains. Using this theory, the PIs will define stories formally as compositions of relational redescriptions. They will develop scalable and steerable storytelling algorithms that respond to dynamic user input, such as preferences and constraints, and they will contextualize the use of these algorithms in interactive visualizations that harness the power of spatial layout. Finally, they will investigate how analysts engage in sense-making using the new storytelling algorithms and visualizations, in the hope of answering questions such as: How do analysts achieve insight and advance their conceptualization of patterns derived from datasets? Project outcomes will include a formal conceptualization of storytelling and a compositional approach to building complex chains of inference.
Broader Impacts: This research will make it easier for analysts to interactively explore connections in large-scale heterogeneous datasets. The PIs will work with the FODAVA-lead team at Georgia Tech and PNNL's NVAC to investigate applications of relational redescriptions and storytelling to domains of interest to NSF and DHS. In consultation with real users across these groups, they will develop a layered software framework that provides storytelling capabilities for both analysis and visualization; the framework will be released to the public under the GNU GPL or GNU Lesser GPL license, and APIs will be provided that allow analysts to tailor it to their needs. Although this project will focus on cyber-analytics scenarios such as those motivated by the VAST 2009 challenge, project outcomes will generalize to other domains such as bioinformatics, systems biology, electronic commerce, and social networks. The unified notion of redescriptions will help integrate multiple data sources (numeric, symbolic, textual, and categorical) and place them on a common footing for visual analytics; it will also enable visual analysts from different application domains to use a common vocabulary while interacting with one another.