Geoscience in the 21st century is increasingly data-driven. Easy, reliable access to trustworthy, quality-controlled data on the Earth and its environment is critical for mining and analyzing the multidisciplinary datasets that make scientific breakthroughs possible and increase our knowledge of the Earth and the processes that occur on and below its surface. Several needs pose major challenges, not only for geoscience but for other fields of science as well: handling the growing number of large-volume, automatically generated datasets; converting important legacy data that are presently available only on paper in journals and other scientific publications; educating students, faculty, and the public in proper data management and curation so they can access already-collected data or make their own data usable by others; and improving discovery of existing legacy data and combining information from disparate data types and formats. As a result, geoscience data increasingly need to be professionally managed and curated, both to make them easily and publicly available and to maximize their potential for research that can benefit society. This funding renews support for the Interdisciplinary Earth Data Alliance (IEDA), one of the premier NSF-funded solid-earth data repositories. IEDA provides discovery of and public access to a large volume and variety of NSF-funded data, as well as data from a number of other sources, so that these data can be used and reused by anyone. IEDA was established in 2009 from a series of independent data management and curation activities to provide a single clearinghouse with shared cyberinfrastructure and tools that help NSF-funded solid-earth researchers in the ocean, earth, and polar geosciences deposit their data for public access and discover and reuse data collected by themselves and others to do new science.
IEDA develops and operates the databases, software tools, and services that support investigators with data stewardship and access throughout the full data life cycle, with a special focus on disciplines that typically generate and use complex, heterogeneous, structured and unstructured datasets that are particularly challenging to manage and combine. IEDA hosts and serves data that include marine seismic data and bathymetry; rock, seafloor sediment, and hydrothermal vent fluid geochemistry; geochronology; Antarctic research; information about physical samples; and other marine and earth science data. It has also developed and deployed map-based data discovery tools and compiled data products that enable quick identification of datasets of interest. Researchers from across the US and around the world have used IEDA data and tools, and combined them in novel ways, to study a wide variety of topics, yielding new insights into the evolution of mid-ocean ridges, processes on continental margins, seafloor and hotspot processes, mantle geodynamics, geohazards, sediment transport between the continents and oceans, global geochemical cycles, and Earth surface processes. These studies would have been impossible, or difficult to complete in a timely manner, without IEDA data holdings and discovery tools.
Beyond promoting public access to important NSF-funded and other geoscience data, the facility's broader impacts include development of a diverse and highly connected geoinformatics workforce, with a focus on promoting and training women in the field; training of students and postdoctoral researchers within the facility in data management and curation; and education and training of students and faculty across the nation in the importance of data sharing, best practices in data collection and curation, metadata preparation, the processes for making data amenable to reuse and reanalysis by others, and access to IEDA and the data systems to which it is linked.

The Interdisciplinary Earth Data Alliance (IEDA) is a unique facility whose operations are based on a partnership of domain-specific data systems that are scientifically linked by their relevance to studies of the solid Earth. These systems share a common repository infrastructure and work together to offer integrated services for data submission, discovery, and access that support interdisciplinary research. IEDA shared services ensure broad discovery of and persistent access to data submitted by both individual investigators and data acquisition facilities. IEDA maintains a close liaison with the science community to ensure that its services are aligned with the practices and requirements of its users. Key components of the IEDA system include repositories for file-based resources; a registry and metadata catalog of geoscience samples; web applications for data submission and sample registration; synthesis datasets for global seafloor topography and geochemistry; user interfaces for text-based and map-based data discovery and access; support for machine clients to submit and access data and metadata; and software for data visualization and exploration. IEDA also provides tools for data management planning and reporting to help investigators comply with NSF data management policies. Global seafloor topography and solid-earth geochemical data are processed and synthesized for consistency and completeness, made accessible through web sites and web services, and included as data layers in IEDA's data exploration software, GeoMapApp. IEDA data curators review file-based data submissions for metadata quality to ensure that documentation is sufficient to support future reuse. For long-term preservation, IEDA data are archived at NOAA's National Centers for Environmental Information or the Columbia University Long-Term Archive.
IEDA uses open-source, standards-based technologies to promote interoperable systems for exchanging data and information and foster next-generation geoscience research.

Agency: National Science Foundation (NSF)
Institute: Division of Ocean Sciences (OCE)
Type: Cooperative Agreement (Coop)
Application #: 1636653
Program Officer: Candace Major
Project Start:
Project End:
Budget Start: 2017-05-01
Budget End: 2020-06-30
Support Year:
Fiscal Year: 2016
Total Cost: $7,359,917
Indirect Cost:
Name: Columbia University
Department:
Type:
DUNS #:
City: New York
State: NY
Country: United States
Zip Code: 10027