This EAGER award creates an interoperability test bed to identify the components of an effective layered architecture for geoscience and environmental science research. In a layered architecture, every layer consists of different technologies, each of which uses different interaction protocols. The proposed project will examine a wide variety of existing technologies in terms of their effectiveness in working across present data silos. These technologies include data grids, workflow systems, policy management systems, web visualization services, and security protocols that work with various repository catalogs. Project goals focus on developing cyberinfrastructure tools and approaches that allow geoscience data repositories to enable new science and to make their data holdings more discoverable and available to the public. Essential elements of the project include collecting and comparing various approaches and existing tools to assess their effectiveness in handling and integrating geoscience data, and automating the processes needed to integrate various databases and data types. The project is led by a team of experts in cyberinfrastructure and geoscience data management and employs a spiral software development approach. Broader impacts of the work include building infrastructure that facilitates data-enabled science in the geosciences. The project will also produce results that are likely to be applicable to fields outside of the geosciences. This work supports a larger NSF initiative, EarthCube, which aims to establish a new paradigm in the development of an integrative and interoperable data and knowledge management system for the geosciences.

Project Report

The goal of this EAGER award (Award #1239603, PI: Shaowen Wang) was to explore standard interoperability mechanisms for building data cyberinfrastructure for various geoscience domains. Each domain has developed community cyber-resources such as data repositories, information catalogs, and data manipulation services. Examples include a precipitation database created by the Consortium of Universities for the Advancement of Hydrologic Science, Inc., and climate data records stored at the National Climatic Data Center. Each community resource is accessed through web services, manages different data formats, and uses a different vocabulary for describing its data. This award explored the interoperability mechanisms that link these resources to collaboration environments.
The Intellectual Merit achievements of the project include: 1) identifying a loosely coupled federation architecture that enables integration of community resources with collaboration environments, 2) demonstrating interoperability mechanisms that encapsulate the knowledge needed for distributed data access or processing, 3) identifying multiple interoperability mechanisms appropriate for the geosciences, 4) demonstrating how to integrate the interoperability mechanisms into scientific workflows, and 5) demonstrating the ability to support reproducible data-driven research through the registration and sharing of scientific workflows. Multiple demonstrations of interoperability mechanisms were developed for interacting with community resources by linking them to a collaboration environment. The community resources included workflow systems, data repositories, information catalogs, and analysis services.
The work done under this award is already having a Broader Impact. The approach developed by this project has been applied to address interoperability challenges in other NSF data management projects and geospatial scientific tools.
The use of collaboration environments as the unifying middleware is also being leveraged by other disciplines, including plant biology, astronomy, astrophysics, high energy physics, genomics, neurosciences, and cognitive science.
The outcomes of this EAGER award include:
· Identification of the collaboration environment as middleware that enables interoperability. The collaboration environment serves as a unifying infrastructure that links community resources to compute resources. It enables the sharing of data and workflows, and the re-execution of workflows.
· Identification of mechanisms to enable reproducible data-driven research. These include the automation of data retrieval from community resources, the automated transformation of the data to the formats required for scientific analyses, and the archiving of analysis workflows, input files, and output files. Workflows can be executed within a researcher’s local computing environment or on NSF XSEDE resources.
· Demonstration of multiple interoperability mechanisms at the First International Conference on Space, Time, and CyberGIS (CyberGIS’12: www.cigi.illinois.edu/cybergis12/). The demonstrations used cyberinfrastructure resources provided by the NSF CyberGIS, OpenTopography, and XSEDE projects. The mechanisms implemented included re-usable functions (micro-services) that encapsulate the knowledge needed to interact with community resources and retrieve data, and micro-services that encapsulate interactions with scientific workflow systems.

Agency: National Science Foundation (NSF)
Institute: Division of Earth Sciences (EAR)
Type: Standard Grant (Standard)
Application #: 1239603
Program Officer: Barbara Ransom
Budget Start: 2012-04-01
Budget End: 2013-03-31
Fiscal Year: 2012
Total Cost: $18,000
Name: University of Illinois Urbana-Champaign
City: Champaign
State: IL
Country: United States
Zip Code: 61820