Recent technological advances have enabled massively parallel analysis of gene expression, metabolite accumulation and protein accumulation. The ability of these "-omics" approaches to generate biological inference faces two limitations. The first is how to combine the different -omics datasets into a single analysis. The second is that the high number of unknown genes and metabolites limits the potential inference from any given dataset. The primary goal of this project is to test whether natural genetic variation can be used to integrate metabolomics and transcriptomics datasets and to identify linkages between unknown metabolites and enzymes. Specifically, metabolite QTL analysis will be performed using the Arabidopsis Bay-0 × Sha recombinant inbred line population as well as wild-type accessions. These data will be used to develop and test methods that link the two omics-level datasets through genetic co-variance, rather than through direct co-variance between metabolites and transcripts. All computational approaches will first be developed with known metabolites and enzyme-encoding genes and then tested using the unknown metabolites and enzyme-encoding genes.
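As a rough illustration of linking by genetic co-variance, the sketch below compares the per-marker allelic effect profiles of a transcript and a metabolite instead of correlating their raw values across lines. It is a minimal toy under stated assumptions, not the project's methodology: the function names, the "B"/"S" allele encoding (standing in for Bay-0 and Sha), and all data values are hypothetical.

```python
from statistics import mean


def allele_effects(genotypes, trait):
    """Per-marker effect: mean trait of lines with allele "B" minus lines with allele "S".

    genotypes: one string per line, one character ("B" or "S") per marker (hypothetical encoding).
    trait: one measured value per line, in the same order as genotypes.
    """
    effects = []
    for m in range(len(genotypes[0])):
        b_vals = [t for g, t in zip(genotypes, trait) if g[m] == "B"]
        s_vals = [t for g, t in zip(genotypes, trait) if g[m] == "S"]
        effects.append(mean(b_vals) - mean(s_vals))
    return effects


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def genetic_similarity(genotypes, trait_a, trait_b):
    """Correlate two traits through their marker-effect profiles, not their raw values."""
    return pearson(allele_effects(genotypes, trait_a),
                   allele_effects(genotypes, trait_b))


# Toy data (invented): 8 lines x 3 markers; both traits are driven mainly by marker 0.
genotypes = ["BBB", "BBS", "BSB", "BSS", "SBB", "SBS", "SSB", "SSS"]
transcript = [3.1, 2.9, 3.0, 3.2, 1.0, 1.1, 0.9, 1.2]
metabolite = [5.2, 4.8, 5.1, 4.9, 1.1, 0.8, 1.0, 1.3]

similarity = genetic_similarity(genotypes, transcript, metabolite)
print(f"similarity of marker-effect profiles: {similarity:.3f}")
```

A real analysis would replace the mean-difference effects with a proper QTL mapping model and a significance threshold; the point here is only that the comparison is made at the level of shared genetic effects.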
In addition to facilitating the development of approaches for integrating different omics-level datasets, the use of natural genetic variation will allow the link between genetic variation and phenotypic variation to be tested by combining transcriptomics and metabolomics for QTL analysis. Expected outcomes include techniques that will be directly applicable to any species in which transcriptomics and metabolomics are feasible. A further expected outcome is a database of plant metabolic variation that will be of general use to plant biologists and could enhance the rate of QTL identification and validation within a model plant system.
Broader Impacts: Because the approaches used are based on the ability to detect and measure metabolites and transcripts, all statistical methodology and theoretical underpinnings developed in this project will be useful in any biological system where metabolites and transcripts can be measured. In addition, the project has potential broader impact for the biological sciences in general through the study of the link between genetic variation and phenotypic variation. This link is of particular importance in agriculture, where plant and animal improvement relies upon genetic and phenotypic variation, and in medicine, where numerous diseases depend upon genetic variation that alters metabolism.
Finally, the project will enable the training of a graduate student in the experimental design, technical details and analysis required to obtain and interrogate large metabolomic datasets. These skills will be integral to the success of post-genomic researchers attempting to generate and test hypotheses through the use of large-scale genomic datasets.