The growing power of supercomputers significantly advances scientists' ability to simulate more complex problems in greater detail, leading to high-impact scientific and engineering breakthroughs. To fully understand the vast amounts of data these simulations produce, scientists need scalable solutions that can perform complex data analysis at different levels of detail. Over the years, visualization has become an important method for analyzing data generated by a variety of computationally intensive applications. The selection of visualization parameters and the identification of important features, however, are mostly done in an ad hoc manner. To enable users to explore data systematically and effectively, in this collaborative research effort involving the Ohio State University and Michigan Technological University, the PIs explore an information-theoretic framework to evaluate the quality of visualization and guide the selection of algorithm parameters.
The research team plans to develop a four-tier analysis framework based on information theory. The bottom tier of the framework consists of information measurement components, in which data are modeled as probability distributions. Building on these components, tier two evaluates and optimizes the most common visualization algorithms, including isosurface extraction and flowline generation, so that they reveal the greatest amount of information in the data. The PIs also investigate issues related to information measurement in image space and optimize direct volume rendering results. Tier three focuses on the analysis of time-varying and multivariate data sets. The PIs will develop methods for identifying important spatio-temporal regions in time-varying data sets and for measuring the information flow in multivariate data sets in order to identify causal relationships among different variables. In the fourth tier, information theory is used to assess the quality of different levels of detail in multi-resolution volumes and images, and to select the level of detail that optimizes visualization quality while satisfying the underlying performance constraints.
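As a rough illustration of the bottom tier, the sketch below treats a scalar data set as a probability distribution by binning its samples into a histogram and then computes the Shannon entropy of that distribution. It is a minimal sketch under stated assumptions, not the project's implementation: the function name `shannon_entropy`, the default of 16 bins, and the two synthetic fields are hypothetical choices made for illustration only.

```python
import math
from collections import Counter


def shannon_entropy(values, num_bins=16):
    """Estimate the Shannon entropy (in bits) of a list of scalar samples.

    The samples are binned into a histogram, and the normalized bin counts
    are treated as a probability distribution. The bin count is an
    illustrative assumption, not a value taken from the project.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant field has a single outcome, so zero entropy
    width = (hi - lo) / num_bins
    # Map each sample to a bin index; clamp the maximum value into the last bin.
    counts = Counter(min(int((v - lo) / width), num_bins - 1) for v in values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical synthetic "fields" used only to exercise the function.
constant_field = [1.0] * 64                    # every sample is identical
ramp_field = [float(i) for i in range(64)]     # samples spread evenly over the range

print(shannon_entropy(constant_field))
print(shannon_entropy(ramp_field))
```

Under this measure, a region whose values are spread evenly across the bins scores higher than a nearly constant region, which is one simple way such a measure could be used to rank parts of a data set by information content.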
The key accomplishment of this project will be the development of a rigorous information-theory-based solution to assist scientists in comprehending the vast amounts of data generated by large-scale simulations and in creating effective visualizations. To target the research at real-world applications, the PIs are collaborating with combustion scientists at Sandia National Laboratories, who are at the forefront of their field in employing extreme-scale computing to solve the most challenging problems. The four-tier information-theoretic framework will be implemented using the Visualization Toolkit (VTK) and will be released to general users. New algorithms and techniques developed in the project will be disseminated through the project web site (www.cse.ohio-state.edu/~hwshen/Research/NSF_GV2010) and through presentations at the annual visualization and application-specific conferences in which the PIs have been actively participating. The dissemination plan will also include reaching general audiences through news, stories, and presentations to enhance their understanding and appreciation of the value of visualization. This project provides training to graduate, undergraduate, and underrepresented students in the area of computational science and large-scale data analysis and visualization.
The key accomplishment of this collaborative project is the development of a rigorous information-theory-based solution to assist scientists in comprehending the vast amounts of data generated by large-scale simulations. We have successfully promoted a new way to conduct scientific data analysis and visualization using information theory in many applications, including scalar and vector field data and time-varying multivariate data. This information-theoretic approach has now been widely accepted by the visualization community as a viable solution for quantifying information in data analysis and visualization. Throughout this project, PI Shen at Ohio State University has published a total of 13 papers in top journals and conferences and graduated three Ph.D. students. He has also given several invited talks and summer school courses. PI Wang at Michigan Tech has published a total of nine journal and conference papers, given nine invited talks at research institutions and laboratories, and advised six Ph.D., M.S., and undergraduate students. PIs Shen and Wang organized a tutorial at IEEE VisWeek 2013 and disseminated our research results from this project through the tutorial presentation and discussion. To target the research at real-world applications, the PIs have collaborated with climate and combustion scientists who are at the forefront of their fields in employing extreme-scale computing to solve the most challenging problems. The PIs have also successfully reached out to middle and high school students through summer youth programs (Women in Computer Science, Women in Engineering, and Engineering Scholars Program) and open house events, inspiring them to pursue visualization research in the future.