Development of tools to analyze and integrate the avalanche of heterogeneous, multi-omic data (e.g. genomics, transcriptomics, ChIP-seq, and proteomics) is a major NIH priority. These tools are necessary to transform huge datasets into the knowledge that will enable prevention, detection, and treatment of disease. Unfortunately, multi-omic analysis is far more challenging than single-ome analysis, and the vast majority of labs in the biological community are not equipped to perform it. We propose to develop and test an open-source workflow platform to facilitate multi-omic data analysis: the WINGS MultiOmic Discovery Engine (WINGS-MDE), which will extend the capabilities of WINGS, an open-source semantic workflow system developed by Dr. Gil. Our innovative approach includes four key features. Ease of use - users interact with a simple, cloud-based web interface for workflow development and execution. Intelligence - a semantic workflow reasoner will significantly automate workflow development, validate new or altered workflows, and perform meta-analyses that trigger a researcher's workflows on new data and alert the researcher to interesting findings. Flexibility - an adaptable plug-in architecture allows the alteration of parameters, the addition of algorithms, and the assessment of incremental changes in a workflow. Designed for multi-omics - WINGS-MDE will support diverse, multi-omic workflows of broad interest. WINGS-MDE will also contain an execution engine able to run workflows over distributed resources and manage data at large scale. In addition, we will develop a provenance repository that captures how data were analyzed, to facilitate reuse and reproducibility.
Our Specific Aims are to: (1) create a repository of semantic workflows for performing multi-omic analysis, which will contain the most common multi-omic analysis components and enable their use and reuse; (2) develop a multi-omic discovery meta-workflow engine, which will enable researchers to compare workflows and establish how sensitive results are to particular aspects of a workflow; and (3) develop an inter-lab workflow sharing environment, which will support the enhanced annotation and dissemination of public datasets. This work will enable diverse researchers to develop and perform multi-omic analysis in a rigorous, reproducible, and shareable manner. We anticipate that the analytical methods produced through this project will improve the ease of use, transparency, reproducibility, and testability of multi-omic analysis, increasing its impact on the understanding of disease biology and treatment.
Tools to analyze and integrate the growing avalanche of voluminous, heterogeneous multi-omic data are expected to enable new ways to prevent, detect, and treat disease. Currently, the analysis of these data is challenging; consequently, important disease mechanisms are going undiscovered.