The primary objective of this proposal is to establish a robust, cloud-based National Metabolomics Data Repository (NMDR) founded on the FAIR principles (findable, accessible, interoperable, and reusable) and designed for scalability, extensibility, and portability. Over the past six years, we have developed the first version of such a repository, known as the Metabolomics Workbench (MW). Our overarching effort through this proposal will be to develop the next-generation MW as follows:
- Support a state-of-the-art database infrastructure based on an open-source relational database, PostgreSQL, which will house all metadata and data pertaining to metabolomics.
- Establish community-accepted metadata and data standards and develop new standards, where lacking, with the aid of the expert community and other stakeholders.
- Provide multiple easy-to-use interfaces with multiple format options for researchers to enter all metadata and data associated with a metabolomics study.
- Develop interfaces and tools for easy querying and analysis of the data, along with Application Programming Interfaces (APIs) to add to or extend existing tools and interfaces.
- Generate, in consultation with the SEPCC, best-practice protocols and APIs for tool integration.
- Coordinate with the Metabolomics Steering Committee, the Governing Board, the SEPCC, and other stakeholders to ensure that the broad goals of the NIH Metabolomics Consortium are achieved.
- Formulate mechanisms for broad community participation in burgeoning metabolomics resources and communicate to the larger biomedical community the value of using metabolomics data and tools for research.

The proposed NMDR will have four cores: a) the Admin Core, which will be responsible for overall coordination, including the administration of all the NMDR cores and the establishment of guidelines for coordination with all other stakeholders; b) the Data Repository Core, which will house all metabolomics data, provide a large suite of tools and interfaces for querying, analyzing, and displaying the data on a web portal known as the nextgen MW, and encourage the broader community to deposit data; c) the Governance Core, which will, along with the SEPCC, establish a body of eight experts (four recommended by the PI) to provide in-depth guidance to the NMDR on the choice of formats, protocols, and tools; and d) the Data Services Core, an additional core that will lay the groundwork for long-term stability. This forward-looking core will equip the NMDR for the future by porting the nextgen MW into a hybrid cloud environment and developing a community-supported container technology for cloud-based metabolomics analysis tools.

Public Health Relevance

Over the past decade, the rapid growth of metabolomics research has created a pressing need for a robust data repository that helps the scientific community find, access, integrate, and reuse the data generated. The National Metabolomics Data Repository (NMDR) is designed to serve this purpose. Its core components include the stored data; interfaces to upload, download, query, display, and analyze the data; and a suite of tools for interactive analysis. Further, the NMDR will be accessible through an open cloud environment.

National Institutes of Health (NIH)
National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Resource-Related Research Multi-Component Projects and Centers Cooperative Agreements (U2C)
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Castle, Arthur
University of California, San Diego
Engineering (All Types)
Schools of Arts and Sciences
La Jolla
United States