The goal of this research project is to devise new visualization tools that help scientists gain insight from their high-dimensional data. High-dimensional data are observations with many attributes, on the order of hundreds or more. Today's data are often inherently high-dimensional: DNA microarrays, financial tick-by-tick data, and hyperspectral imagery, to name just a few. The challenge in visualizing these data comes from the limited dimensionality of the screen. Traditional data visualization paradigms cannot fully map high-dimensional properties to a two-dimensional display without losing semantics, patterns, or structure. This can lead to ambiguous and even misleading visualizations. To bridge this fundamental gap, the display system developed in this project uses methods drawn from illustrative design to communicate these elusive properties, which are derived from analysis in the high-dimensional data space. A second important motivation of this research is that this illustration-inspired approach is expected to produce visualizations that are easier to interpret and manipulate.
The overall theme of this work is to use information abstraction and illustrative mappings to improve display comprehensibility, reduce unnecessary complexity, and communicate high-dimensional data patterns more faithfully. The illustrative framework is driven by a two-pronged data analysis suite: filtering creates a data representation at multiple levels of scale, and pattern classification identifies suitable appearance illustrations. Both analyses are performed in the native high-dimensional data space to preserve the original structures. Various illustrative styles are linked to visual semantics to provide an intuitive data display. The generality of the framework allows it to map readily to the three most prominent high-dimensional visualization platforms: space embeddings, parallel coordinates, and scatter plots. Illustrative visualization design and validation are carried out in collaboration with experts in Environmental Science and the Human Microbiome Project.
The system is designed to support domain scientists in knowledge discovery, and also to appeal to casual users by supporting data analysis via illustrative design. The display looks more natural because it uses familiar graphic design paradigms to construct the illustrative visualizations. The project webpage (www.cs.sunysb.edu/~mueller/IllustratorND) provides information on ongoing progress, invites participation in user studies, and offers some data analysis capabilities through a web-enabled version of the software. The project offers educational and research opportunities for students.