The functions of more than one-third of the annotated protein-coding genes of the Arabidopsis genome are still unknown, and the annotation of an even larger portion of the genome is not sufficiently accurate for unambiguous assignment of function at the biochemical and physiological levels. This project will bring together a consortium of multidisciplinary collaborators to establish pipelines for generating metabolomics data streams and to provide statistical and computational interpretation of the resulting integrated datasets. The goal is to develop metabolite-profiling capabilities that will enhance the research community's ability to formulate testable hypotheses concerning Arabidopsis gene functions. The consortium has developed metabolomic platforms that together detect approximately 1,800 metabolites, of which 900 are chemically defined. The aim of the project is to apply these established metabolite-profiling platforms to reveal changes in the metabolome associated with knockout mutations in up to 200 Arabidopsis genes of unknown function and to compare these with changes in similar mutants of 50 genes of known function. The consortium will disseminate these data via the existing multi-functional metabolomics database at www.plantmetabolomics.org. Enhancement of this database and its associated statistical and visualization toolsets will enable researchers to formulate testable computational models of the metabolic network of Arabidopsis. The successful completion of these goals, and their integration with other NSF-sponsored functional genomics and cyberinfrastructure developments, will generate transformational resources for ultimately modeling the complex metabolism of Arabidopsis.

Broader Impacts

The project will develop new resources for the research community that will enhance the capability to globally profile genome expression at the metabolite level. These metabolite resources, in collaboration with other NSF-funded resource development projects, will enable researchers in the community to formulate credible, testable hypotheses concerning gene function. The project will foster the development of the science of metabolomics as a functional genomics tool through workshops, internships and organization of national and international meetings. The project will also develop new activities to enhance the impact of science education and training in the community, by conducting workshops for researchers at consortium labs and at international biological meetings. In addition, research internships will be offered to undergraduate students, eight of whom will have the opportunity to experience international science training in a European genomics laboratory. These research-based training internships will illustrate to the students the synergy that accompanies the integrated applications of chemistry, biochemistry, genetics, bioinformatics and computational sciences to solving complex biological problems.

Project Report

One of the major challenges of high-throughput, genomics-based biological research is that the vast majority of the genes cataloged by genome sequencing projects are annotated on the basis of sequence homology. Because experimentally based annotation is a slower process than genome sequencing, most homology-based annotations are not supported by experimentally verified data. This multi-institutional, multi-disciplinary project tested the utility of metabolomics for discovering the functions of genes whose functions are unknown. The underlying hypothesis of the project was that integrating genomics and genetics with metabolomics would yield data that are highly revealing of accurate gene functions. The project, through its construction of an integrated multi-institutional analytic platform, demonstrated this capability by compiling metabolite identification and abundance data for nearly 1,800 metabolites across 250 mutant lines of Arabidopsis. In addition to the data themselves, the project established standardized protocols for generating such comprehensive and accurate metabolomics data and associated metadata, and for distributing them through newly developed online databases, www.plantmetabolomics.org and http://metnetdb.org/PMR/. The success of the project is evidenced by the fact that metabolomics is now a major thrust in many functional genomics research projects, and some of these projects are depositing their metabolomics data in this project’s PMR database. Furthermore, this project was precedent-setting and led a number of institutions to invest in facilities and faculty, which has expanded this country’s metabolomics research capabilities.
Moreover, the broader, long-term impact of the project comes from the training of undergraduate, graduate and post-doctoral students in a multi-disciplinary setting at the interface of biochemistry, analytical chemistry, genetics and genomics, and biostatistics and computational sciences.

Agency: National Science Foundation (NSF)
Institute: Division of Molecular and Cellular Biosciences (MCB)
Application #: 0820823
Program Officer: Kamal Shukla
Project Start:
Project End:
Budget Start: 2009-03-01
Budget End: 2014-02-28
Support Year:
Fiscal Year: 2008
Total Cost: $2,925,398
Indirect Cost:
Name: Iowa State University
Department:
Type:
DUNS #:
City: Ames
State: IA
Country: United States
Zip Code: 50011