There is growing interest in the medical genomics community in using metabolomics data to complement genomic (DNA) and gene expression (RNA) studies. The goal is often to understand the molecular processes of disease in order to identify novel therapeutic approaches and to use molecular profiling as an aid in treatment decisions. Methods for metabolomics data analysis are typically not accessible to clinicians, and there is a pressing need for strategies that integrate both data types in biomedical research. Both Conesa and McIntyre have developed software tools to make metabolomics data analysis and interpretation easier for scientists with a genetics background. McIntyre developed a Galaxy module for the analysis of metabolomics data that identifies differentially expressed metabolites. Conesa created the PaintOmics tool, a web-based resource to jointly visualize metabolomics and genomics data over the template of KEGG pathways. However, a fully integrated analysis platform is still missing. In this R03 we will join the previous developments from both groups to create a platform for the integrative analysis of genomics and metabolomics data based on the Galaxy environment.
In Aim 1, building on existing solutions from PaintOmics, we will develop a module to import KEGG into Galaxy and map lists of significantly differentially expressed genes and metabolites onto the KEGG pathways. This will require a full re-implementation of the PaintOmics Java code as Python scripts.
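As a minimal sketch of the mapping step in Aim 1, the Python fragment below intersects lists of significant genes and metabolites with pathway membership. It is an illustration only, not the PaintOmics code: the `PATHWAYS` dictionary is a small hand-written stand-in for content that would be imported from KEGG, and the function name `map_features` is hypothetical.

```python
# Illustrative, hand-written subset of pathway membership (assumption:
# the real module would populate this structure from the KEGG import).
PATHWAYS = {
    "hsa00010": {"genes": {"HK1", "PFKL", "PKM"}, "compounds": {"C00031", "C00022"}},
    "hsa00020": {"genes": {"CS", "IDH1"}, "compounds": {"C00022", "C00158"}},
}


def map_features(pathways, sig_genes, sig_compounds):
    """Return, for each pathway, the significant genes and compounds it contains.

    Pathways with no significant feature are omitted from the result.
    """
    sig_genes = set(sig_genes)
    sig_compounds = set(sig_compounds)
    hits = {}
    for pathway_id, members in pathways.items():
        genes = sorted(members["genes"] & sig_genes)
        compounds = sorted(members["compounds"] & sig_compounds)
        if genes or compounds:
            hits[pathway_id] = {"genes": genes, "compounds": compounds}
    return hits


# Hypothetical input lists; TP53 is in neither toy pathway and is dropped.
result = map_features(PATHWAYS, ["HK1", "IDH1", "TP53"], ["C00022"])
for pathway_id, found in result.items():
    print(pathway_id, found)
```

Keeping the mapping as a plain set intersection over an imported membership table would let the same function serve both omics types, which is one plausible way to structure a Galaxy tool wrapper around it.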
In Aim 2 we will develop new statistical methods for integrative pathway analysis using genomics and metabolomics. We will use the KEGG topology to identify subgraphs enriched for significant features of both omics by analyzing the probability of a gene being differentially expressed conditional on the differential expression of a neighboring metabolite. We will also adapt previous developments in the McIntyre lab that infer genetic interaction networks to predict additions of unidentified significant metabolites to the metabolic networks. By using the biologist-friendly Galaxy platform we expect to make metabolomics-genomics integration more accessible to clinicians, help the biomedical community understand the relationship between gene expression and metabolite changes in relation to disease, and contribute to the development of new clinical insights that lead to novel therapies and/or diagnostics.
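The conditional probability mentioned in Aim 2 can be illustrated with a simple empirical estimate over gene-metabolite edges. This is a sketch under stated assumptions, not the statistical method to be developed: the edge list, the feature names, and the function `conditional_de_probability` are all hypothetical, and a real analysis would take edges from the KEGG topology and account for sampling uncertainty.

```python
def conditional_de_probability(edges, de_genes, de_metabolites):
    """Estimate P(gene is DE | neighboring metabolite is DE).

    `edges` is a list of (gene, metabolite) pairs that are adjacent in a
    pathway graph. Returns None when no edge touches a DE metabolite.
    """
    de_genes = set(de_genes)
    de_metabolites = set(de_metabolites)
    # Keep only the edges whose metabolite end is differentially expressed.
    conditioned = [(g, m) for g, m in edges if m in de_metabolites]
    if not conditioned:
        return None
    # Among those edges, count the ones whose gene end is also DE.
    both = sum(1 for g, _ in conditioned if g in de_genes)
    return both / len(conditioned)


# Toy graph with placeholder names (assumption, not KEGG content).
EDGES = [
    ("geneA", "met1"),
    ("geneB", "met2"),
    ("geneC", "met3"),
    ("geneD", "met3"),
]
p = conditional_de_probability(EDGES, {"geneA", "geneC"}, {"met1", "met3"})
print(p)
```

In this toy graph three edges touch a differentially expressed metabolite and two of them also have a differentially expressed gene, so the estimate is 2/3.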
The combination of metabolomics and transcriptomics information is of interest in biomedical research. We propose to develop a user-friendly and clinically oriented bioinformatics solution for integrative analysis of metabolomics and transcriptomics data based on the Galaxy platform. This solution will contain modules to import the KEGG database into Galaxy and to map the user's metabolomics and transcriptomics data onto KEGG, together with a new method to perform integrative pathway analysis using both technologies.