As computing power continues to increase, the complexity of simulations also increases, producing datasets of unprecedented size. Without effective analysis tools, results from these large-scale simulations cannot be utilized to their fullest extent. This research addresses the problem of large-data visualization and exploration by employing interactive multi-scale machine learning, which exploits an efficient feature-based, multi-resolution representation of the data. The investigators are leveraging methods from the field of machine learning to perform two distinct tasks: identifying regions of interest and enhancing the robustness of feature detection algorithms. The primary outcome of this effort is a framework for exploring large datasets. Further, this work is introducing a large body of machine learning research to the field of visualization. Successful completion of this research will help overcome the brittleness of existing visualization methods and foster expedient discovery in many areas of science and engineering.

The multi-resolution techniques developed here employ a two-fold strategy. First, semi-supervised learning, trained with input from a domain expert, is used to develop strategies for selective spatial and temporal refinement of the data. A classifier is constructed to tag the output of the coarse-resolution feature detection (i.e., regions) as either interesting or not interesting. Then, at the finest scale, local data chunks containing features of interest are identified for further analysis. Second, several local feature detection algorithms, or weak classifiers, are combined into a single, more robust compound classifier using adaptive boosting (AdaBoost) and a data-adaptive variant called CAVIAR that facilitates validated feature detection. Ideally, the compound classifier combines the best of all weak classifiers as they respond to the underlying physical signal. This research is demonstrating the effectiveness of these methods by applying existing local detection algorithms to visualize vortices in turbulent flow fields.
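
To make the boosting step concrete, the following is a minimal sketch of standard discrete AdaBoost applied to a fixed pool of weak detectors. It is not the investigators' implementation and does not cover the CAVIAR variant. The function names (`adaboost_combine`, `predict`), the +1/-1 vote encoding, and the assumption that each detector's votes on expert-labeled samples are already available are all hypothetical choices made for illustration.

```python
import math


def adaboost_combine(weak_votes, labels, rounds=3):
    """Combine a fixed pool of weak detectors with discrete AdaBoost.

    weak_votes[j][i] is detector j's vote (+1 feature, -1 no feature)
    on sample i; labels[i] is the expert label (+1 or -1).
    Returns a list of (detector_index, alpha) pairs.
    """
    n = len(labels)
    weights = [1.0 / n] * n  # start with uniform sample weights
    ensemble = []
    for _ in range(rounds):
        # Weighted error of each detector under the current weights.
        errors = [
            sum(w for w, v, y in zip(weights, votes, labels) if v != y)
            for votes in weak_votes
        ]
        j = min(range(len(weak_votes)), key=errors.__getitem__)
        if errors[j] >= 0.5:
            break  # no detector is better than chance; stop early
        err = max(errors[j], 1e-10)  # guard against log of infinity
        alpha = 0.5 * math.log((1.0 - err) / err)
        ensemble.append((j, alpha))
        if errors[j] == 0.0:
            break  # a perfect detector needs no further rounds
        # Raise the weight of misclassified samples, lower the rest.
        weights = [
            w * math.exp(-alpha * y * v)
            for w, v, y in zip(weights, weak_votes[j], labels)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, votes):
    """Compound decision: sign of the alpha-weighted sum of detector votes."""
    score = sum(alpha * votes[j] for j, alpha in ensemble)
    return 1 if score >= 0 else -1
```

In this sketch, detectors that err on samples the others handle correctly receive complementary weights, so the compound vote can be correct where any single detector fails.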

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1065081
Program Officer: Ephraim Glinert
Project Start:
Project End:
Budget Start: 2011-06-01
Budget End: 2015-05-31
Support Year:
Fiscal Year: 2010
Total Cost: $303,444
Indirect Cost:
Name: University of Florida
Department:
Type:
DUNS #:
City: Gainesville
State: FL
Country: United States
Zip Code: 32611