Material samples are a basic element of reference, study, and experimentation in many scientific disciplines, especially in the natural and environmental sciences, material sciences, agriculture, physical anthropology, archaeology, and biomedicine. Observations made on samples collected in the field and in the laboratory constitute a critical data resource for research that addresses grand challenges of our planet's future sustainability, from environmental change; to food, energy, and water resources; to natural hazards and their mitigation; to public health. The large investments of public funds being made to curate huge volumes of samples acquired over decades or even centuries, and to collect and analyze new samples, demand that these samples be openly accessible, easily discoverable, and documented with sufficient information to make them reusable. The current ecosystem of sample and sample data management in the U.S. and globally is highly fragmented across stakeholders, including museums, federal agencies, academic institutions, and individual researchers, with a multitude of institutional and discipline-specific catalogs, practices for sample identification, and protocols for describing samples. The iSamples project is a multi-disciplinary collaboration that will develop a national digital infrastructure to provide services for globally unique, consistent, and convenient identification of material samples; for recording metadata about them; and for linking them to other samples, derived data, and research results published in the literature. iSamples builds on previous initiatives to achieve these goals by providing material samples with globally unique, persistent identifiers that reliably link to landing pages with metadata describing the sample and its provenance, and that allow samples to be linked unambiguously with data and publications.
Leveraging significant national investments, iSamples provides the missing link among (i) physical collections (e.g., natural history museums, herbaria, biobanks), (ii) field stations, marine laboratories, long-term ecological research sites, and observatories, and (iii) data repositories and cyberinfrastructure. iSamples delivers enhanced infrastructure for STEM research and education, decision-makers, and the general public. iSamples benefits national security and resource management by offering a means to assure sample provenance, improve scientific reproducibility, and demonstrate compliance with ethical standards, national regulations, and international treaties.

The Internet of Samples (iSamples) is a multi-disciplinary and multi-institutional project to design, develop, and promote service infrastructure to uniquely, consistently, and conveniently identify material samples, record metadata about them, and persistently link them to other samples and derived digital content, including images, data, and publications. The project will create a flexible and scalable architecture to ensure broad adoption and implementation by diverse stakeholders. iSamples will build upon existing identifier infrastructure such as IGSNs (Global Sample Number) and ARKs (Archival Resource Keys), but is agnostic to identifier type. Likewise, iSamples will encourage a high-level metadata standard for natural history samples (across biosciences, geosciences, and archaeology), while supporting community-developed metadata standards in specialist domains. Through integration with established discipline-specific infrastructure at the System for Earth Sample Registration, SESAR (geoscience), CyVerse (bioscience), and Open Context (archaeology), iSamples will extend existing capabilities, enhance consistency, and expand their reach to serve science and society much more broadly. The project has three main objectives: 1) Design and develop iSamples infrastructure (iSamples in a Box and iSamples Central); 2) Build four initial implementations of iSamples for adoption and use case testing (Open Context, GEOME, SESAR, and Smithsonian Institution); and 3) Conduct outreach and community engagement directed at developers, individual researchers, and international organizations concerned with material samples. The project will follow an agile development process in which community engagement is an important element of creating software requirements and an implementation timeline.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Agency: National Science Foundation (NSF)
Institute: Division of Advanced CyberInfrastructure (ACI)
Type: Standard Grant (Standard)
Application #: 2004839
Program Officer: Alan Sussman
Budget Start: 2020-08-15
Budget End: 2024-07-31
Fiscal Year: 2020
Total Cost: $1,277,628
Name: Columbia University
City: New York
State: NY
Country: United States
Zip Code: 10027