Key Collaborator: Peter Karp (SRI International)

Innovations in the basic understanding of plants and effective application of that knowledge in the field are essential to meet energy and food security challenges in the 21st century. Rapid advances in sequencing technologies will increase the number of sequenced plant genomes exponentially. There is a growing need to place the sequenced genomes in a biochemical context in order to facilitate enzyme discovery, metabolic trait-based breeding and metabolic engineering. Meeting the ever-expanding demand for food, biofuel and phytopharmaceutical production will require a comprehensive and accurate understanding of the enzymes, pathways and regulatory networks that control metabolism in plants. This project will develop an automated pipeline to reconstruct high-quality plant metabolic pathways from genome-scale sequence and functional genomics data. This pipeline will be used to create freely accessible metabolic pathway databases for 18 agriculturally and industrially important plants. Further value will be added to these species-specific databases and to the multi-species PlantCyc reference database through literature-based curation of enzymes and pathways. In addition, new experimentally supported pathway regulation and metabolic flux data will be added to these databases to guide rational metabolic engineering. The computational pipelines, the reference and species-specific databases, and the plant-specific "gold-standard" sets of rate-limiting enzymatic reactions and transcriptional regulatory relationships will facilitate systematic characterization of plant metabolism and metabolic engineering.

The visibility, utility and potential impact of the resources generated by this project will be increased by engaging expert scientists working on both commercially valuable crops and socially valuable "orphan crops" in the pathway validation and curation process. Over 7 researchers worldwide have committed to participating in reconstructing pathway databases for their species of interest. The project will also train several undergraduate interns from a local community college to create a valuable species-tagged plant compound data resource. The interns will increase their biological knowledge and familiarity with a large number of valuable online resources. Furthermore, their exposure to scientists at the Carnegie Institution may prompt them to pursue a career in biological research. The project also plans to stimulate enthusiasm for plant metabolism, especially among under-represented groups of high school students. The project will collaborate with a high school teacher at a public charter school to produce a teaching module on plant metabolism that connects to state and national biology education standards. The module will be presented at several schools serving under-represented minority and low-income students. It will also be made freely available at the project website and actively disseminated to high school biology teachers. All the data, tools and teaching materials produced from this project will be publicly and freely accessible from the project website (http://plantcyc.org), through complete database downloads under a Creative Commons "share-alike" license (http://creativecommons.org/licenses/by-sa/3.0/), and as bulk data sets in downloadable text files.

Agency
National Science Foundation (NSF)
Institute
Division of Integrative Organismal Systems (IOS)
Application #
1026003
Program Officer
Thomas Okita
Project Start
Project End
Budget Start
2010-09-01
Budget End
2016-08-31
Support Year
Fiscal Year
2010
Total Cost
$1,821,570
Indirect Cost
Name
Carnegie Institution of Washington
Department
Type
DUNS #
City
Washington
State
DC
Country
United States
Zip Code
20005