The project objective is to develop a bioinformatics foundation for deciphering the metabolic network of every organism with a fully sequenced genome, in support of drug discovery, metabolic engineering, systems biology, and basic science. Our approach is based on a gold-standard metabolic database, MetaCyc, which is curated by Ph.D.-level biologists from the experimental literature. A second objective is to further develop BioCyc, an evolving collection of Pathway/Genome Databases for 5,000-10,000 sequenced prokaryotic genomes. BioCyc will become the premier source of prokaryotic genome data because of its planned comprehensive coverage of prokaryotic genomes; its integration of multiple information sources; its powerful and user-friendly bioinformatics search, visualization, and analysis tools; and its distribution of data via multiple access channels.

We have four specific aims.
(1) To expand MetaCyc, a highly curated multi-organism database of metabolic pathways and enzymes that serves as an encyclopedic reference of metabolic information. MetaCyc can be used to predict the metabolic pathway complement of an organism from its sequenced genome. Information about experimentally determined metabolic pathways and enzymes will be curated into MetaCyc from the biomedical literature, with a focus on prokaryotic, fungal, and plant information.
(2) To computationally generate BioCyc, a collection of organism-specific Pathway/Genome Databases for completely sequenced prokaryotes and model organisms that includes predicted metabolic pathways, predicted metabolic pathway hole fillers, and predicted operons.
(3) To enhance the Pathway Tools software that supports the querying, visualization, and analysis of MetaCyc and BioCyc: to include new comparative genomics capabilities; to include genome-context-based predictions of functionally related proteins and of novel pathways; to include a tool for iteratively browsing the reaction neighborhood of a metabolite; and to provide textual searches against multiple BioCyc databases.
(4) To make MetaCyc and BioCyc available to the scientific community through a Web portal and via downloadable data files and software.

Public Health Relevance

This project will create a powerful and user-friendly Web portal containing thousands of bacterial genomes, together with the biochemical pathways encoded by each genome. By characterizing the metabolic pathways of thousands of organisms, this project will facilitate alterations to those pathways by metabolic engineering, such as to allow bacteria to synthesize drugs, and it will speed the development of drugs that kill disease-causing bacteria by enabling identification of essential metabolic pathways for disruption.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM080746-08
Application #: 8665437
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Gregurick, Susan
Project Start: 2007-06-19
Project End: 2015-05-31
Budget Start: 2014-06-01
Budget End: 2015-05-31
Support Year: 8
Fiscal Year: 2014
Total Cost: $1,215,153
Indirect Cost: $431,739

Name: SRI International
Department:
Type:
DUNS #: 009232752
City: Menlo Park
State: CA
Country: United States
Zip Code: 94025

Caspi, Ron; Altman, Tomer; Billington, Richard et al. (2014) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res 42:D459-71
Karp, Peter D; Paley, Suzanne; Altman, Tomer (2013) Data mining in the MetaCyc family of pathway databases. Methods Mol Biol 939:183-200
Caspi, Ron; Dreher, Kate; Karp, Peter D (2013) The challenge of constructing, classifying, and representing metabolic pathways. FEMS Microbiol Lett 345:85-93
Altman, Tomer; Travers, Michael; Kothari, Anamika et al. (2013) A systematic comparison of the MetaCyc and KEGG pathway databases. BMC Bioinformatics 14:112
Latendresse, Mario; Krummenacker, Markus; Trupp, Miles et al. (2012) Construction and completion of flux balance models from pathway databases. Bioinformatics 28:388-96
Caspi, Ron; Altman, Tomer; Dreher, Kate et al. (2012) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 40:D742-53
Karp, Peter D; Caspi, Ron (2011) A survey of metabolic databases emphasizing the MetaCyc family. Arch Toxicol 85:1015-33
Karp, Peter D; Paley, Suzanne M; Krummenacker, Markus et al. (2010) Pathway Tools version 13.0: integrated software for pathway/genome informatics and systems biology. Brief Bioinform 11:40-79
Caspi, Ron; Altman, Tomer; Dale, Joseph M et al. (2010) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 38:D473-9
Caspi, Ron; Foerster, Hartmut; Fulcher, Carol A et al. (2008) The MetaCyc Database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res 36:D623-31