Development of tools to analyze and integrate the avalanche of heterogeneous, multi-omic data (e.g., genomics, transcriptomics, ChIP-seq, and proteomics) is a major NIH priority. These tools are necessary to transform huge datasets into the knowledge that will enable prevention, detection, and treatment of disease. Unfortunately, multi-omics analysis is exponentially more challenging than single-ome analysis, and the vast majority of labs in the biological community are not equipped to perform it. We propose to develop and test an open-source workflow platform to facilitate multi-omic data analysis: The WINGS MultiOmic Discovery Engine (WINGS-MDE) will extend the capabilities of WINGS, an open-source semantic workflow system developed by Dr. Gil. Our innovative approach includes four key features. Ease of use - users interact with a simple, cloud-based web interface for workflow development and execution. Intelligence - a semantic workflow reasoner will significantly automate development, validate new or altered workflows, and perform meta-analyses that will trigger a researcher's workflows on new data and alert them to interesting findings. Flexibility - an adaptable plug-in architecture allows the alteration of parameters, the addition of algorithms, and the assessment of incremental changes in a workflow. Designed for multi-omics - WINGS-MDE will support diverse, multi-omic workflows of broad interest. WINGS-MDE will also contain an execution engine able to execute over distributed resources and manage data at large scale. In addition, we will develop a provenance repository to capture how data were analyzed to facilitate reuse and reproducibility.
Our Specific Aims are to: (1) Create a repository of semantic workflows for performing multi-omic analysis, which will contain the most common multi-omic analysis components and enable their use and reuse; (2) Develop a multi-omic discovery meta-workflow engine, which will enable researchers to compare workflows and establish how sensitive results are to particular aspects of a workflow; and (3) Develop an inter-lab workflow sharing environment, which will support the enhanced annotation and dissemination of public datasets. This work will enable diverse researchers to develop and perform multi-omic analysis in a rigorous, reproducible, and shareable manner. We anticipate that the analytical methods produced through this project will improve the ease of use, transparency, reproducibility, and testability of multi-omic analysis, thereby increasing its impact in understanding disease biology and treatment.

Public Health Relevance

Tools to analyze and integrate the growing avalanche of voluminous and heterogeneous multi-omic data are expected to enable new ways to prevent, detect, and treat disease. Currently, the analysis of these data is challenging; consequently, important disease mechanisms are going undiscovered.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM117097-03
Application #
9399670
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Ravichandran, Veerasamy
Project Start
2016-01-01
Project End
2018-12-31
Budget Start
2018-01-01
Budget End
2018-12-31
Support Year
3
Fiscal Year
2018
Total Cost
Indirect Cost
Name
Stanford University
Department
Radiation-Diagnostic/Oncology
Type
Schools of Medicine
DUNS #
009214214
City
Stanford
State
CA
Country
United States
Zip Code
94304