Interactive exploration enables scientists to identify and interpret data, embedded in much larger datasets, that represents significant underlying phenomena. There is a critical need for data management support for interactive exploration of very large datasets. However, scientific database research has not yet had a major impact on the way scientists actually use scientific data. The lack of a comprehensive conceptual model for scientific data has kept current systems ad hoc and limited their generality.
The principal goal of this research is to develop, validate, and prototype a formal data model for distributed multi-resolution scientific data, one that encapsulates the data's inherent structure to guide efficient database implementation. The model includes comprehensive geometry- and topology-based features for describing a wide variety of sampling grids, features for representing sub-domains of a dataset that contain discovered knowledge, and error measures that reflect the accuracy or authenticity of each representation level. From a repository containing a dataset represented as a multi-resolution hierarchy, a scientist accesses successive levels of error-annotated detail to zoom into the meaningful areas, downloading data from these regions to a LAN and their workstation for further