As scientists anticipate the benefits of exascale computing, the lack of novel solutions for processing data at scale and calibrating simulation parameters has become a significant roadblock to further accelerating scientific discovery. The goal of this project is to develop a new end-to-end data analysis and feature extraction workflow based on deep neural networks to help computational scientists address three major challenges: (1) identify important simulation parameters and generate the essential data for analysis, (2) transform the simulation data into compact feature representations that convey the most insight, and (3) design scalable visualization algorithms coupled with large-scale simulations to glean insight into their scientific problems. Working with domain scientists in jet engine design, climate models, cardio/cerebrovascular flow, superconductivity, and fusion energy, the team will demonstrate how deep learning techniques can help extract features from vast amounts of simulation data and navigate the huge simulation parameter space. Through summer internships and project collaborations, this research will create opportunities for graduate and undergraduate students, including students from underrepresented groups, to participate in key research initiatives with leading scientists. Through the planned annual summer school on "Deep Learning for Visualization," the research results will enable visualization researchers and the broader community to adopt the principles and practices of the deep learning techniques developed in this project.

The research team will develop a comprehensive analysis framework that encompasses a suite of state-of-the-art deep learning techniques for in situ processing and analysis of large-scale scientific simulation data. The framework will consist of three tightly integrated components: (1) analysis of simulation parameters and data reduction, (2) post-analysis of data and features, and (3) in situ workflow optimization. For the first component, methods will be developed to assist simulation surrogate creation, parameter space exploration, and comparative analytics of ensemble simulations. For the second component, deep learning techniques will be developed to learn features from data for interactive exploration of representatives and to upscale reduced simulation output in the spatial and temporal domains. For the third component, in situ solutions will be developed for feature detection, workload estimation, and feature computation surrogates. The framework will be evaluated using four types of quantitative metrics: (1) data reduction ratio; (2) data-, feature-, and image-level error measures; (3) scalability measures; and (4) cross-validation with training and testing data. The team will work closely with scientists in the domains of jet engine design, climate, cardio/cerebrovascular flow, superconductivity, and fusion energy. The domain scientists will play a critical role in enabling the research team to understand the requirements of their applications and to evaluate the outcomes of this research. The project's dissemination plan will address a much broader audience, including students, practitioners, and domain scientists, to enhance their understanding and appreciation of the value of deep learning for visualization. The team will release open-source software, pre-trained models, and training and test data generated from this research, including auto-encoders for feature learning, DNN-assisted parameter space exploration, CNN-based feature extraction and tracking, and deep predictive models for load balancing.
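
The abstract names data reduction ratio and data-level error measures among its evaluation metrics but does not define them. As a rough illustration only, the sketch below uses common conventions that are assumptions here, not the project's stated definitions: reduction ratio as original size divided by reduced size, and data-level error as RMSE and PSNR between a reference field and its reconstruction. The function names are hypothetical.

```python
import math


def reduction_ratio(original_bytes: int, reduced_bytes: int) -> float:
    """Original size divided by reduced size (assumed convention)."""
    if reduced_bytes <= 0:
        raise ValueError("reduced_bytes must be positive")
    return original_bytes / reduced_bytes


def rmse(reference, reconstructed) -> float:
    """Root-mean-square error between two equal-length value sequences."""
    if len(reference) != len(reconstructed) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - b) ** 2 for a, b in zip(reference, reconstructed))
    return math.sqrt(squared / len(reference))


def psnr(reference, reconstructed) -> float:
    """Peak signal-to-noise ratio in dB, using the reference value range as peak."""
    error = rmse(reference, reconstructed)
    if error == 0.0:
        return math.inf  # identical data: no reconstruction noise
    value_range = max(reference) - min(reference)
    if value_range == 0.0:
        raise ValueError("reference has zero value range; PSNR is undefined")
    return 20.0 * math.log10(value_range / error)


# Hypothetical toy data standing in for a simulation field and its reconstruction.
reference = [0.0, 1.0, 2.0, 3.0]
reconstructed = [0.0, 1.0, 2.0, 2.0]

print(reduction_ratio(1000, 100))
print(rmse(reference, reconstructed))
print(psnr(reference, reconstructed))
```

Feature- and image-level error measures would follow the same pattern, comparing extracted features or rendered images instead of raw values.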

This award includes funding from the Information Integration & Informatics Program in the Division of Information & Intelligent Systems, the Software & Hardware Systems Program in the Division of Computing & Communication Foundations, and the NSF Office of Advanced Cyberinfrastructure.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1955764
Program Officer: Hector Munoz-Avila
Project Start:
Project End:
Budget Start: 2020-06-01
Budget End: 2024-05-31
Support Year:
Fiscal Year: 2019
Total Cost: $332,720
Indirect Cost:

Name: Ohio State University
Department:
Type:
DUNS #:
City: Columbus
State: OH
Country: United States
Zip Code: 43210