The completion of the genome sequence for an organism such as a disease-causing bacterium marks the start of a phase of accelerated research that can now lead to a quantitative metabolic flux model for the organism. Such metabolic flux models have diverse applications. They enable basic scientific understanding of biochemical pathways. They allow computational predictions of which genes and reactions within the metabolic network are essential, thus facilitating the design of drugs against disease-causing bacteria. They support the engineering of new metabolic pathways, for example in bioenergy research. The huge quantity of data and knowledge defined by the genome sequence, and by the large number of past and future experimental findings for the organism, requires a database that serves as a central repository for information about the genome, the biochemical network, and the regulatory processes of that organism.

The Pathway Tools software is a robust and comprehensive system for constructing organism-specific databases (model organism databases, or MODs) and for pathway analysis of genomes. Pathway Tools enables communities of scientists to create, query, visualize, and analyze MODs, and to publish them on the web. Pathway Tools supports construction of MODs that combine a large number of bioinformatics datatypes, including genome maps, genes, operons, RNAs, proteins, chemical compounds, biochemical reactions, metabolic pathways, and regulatory interactions. Pathway Tools is a mature, production-grade software environment that has been used by 264 groups outside SRI (see Appendix) to analyze 1,969 genomes.

We propose major improvements to the new Pathway Tools module for producing metabolic flux models for individual organisms and for microbial communities. This tool allows much faster creation of these complex models than do comparable tools. We propose to perform a major redesign of the Pathway Tools web interfaces. We propose to enhance the Pathway Tools web metabolic network diagram and regulatory network diagram, which provide interactive, web-based cellular network maps. We propose to create a change-reporting system for notifying scientists of updates to Pathway Tools databases. We propose to create a tool for inferring multimeric protein complexes in sequenced genomes, and to extend Pathway Tools with several sequence-related operations. We propose to provide support services for the large and growing user community for Pathway Tools, to maintain quality documentation for the software, and to create two thoroughly tested releases of the software per year.

Public Health Relevance

The proposed project will generate software tools that increase the ability of scientists to use genome sequence data to advance human health. The software enables the creation of central database repositories and websites containing genome data and data on cellular networks for disease-causing bacteria and for experimental model organisms such as the laboratory mouse. The software allows researchers to access the most up-to-date information about those organisms using intuitive querying tools, and provides scientific visualization capabilities that help scientists understand large, complex collections of data more quickly and apply that data to advance health.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Sledjeski, Darren D
SRI International
Menlo Park
United States

Publications

Latendresse, Mario (2014) Efficiently gap-filling reaction networks. BMC Bioinformatics 15:225
Caspi, Ron; Altman, Tomer; Billington, Richard et al. (2014) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of Pathway/Genome Databases. Nucleic Acids Res 42:D459-71
Karp, Peter D; Paley, Suzanne; Altman, Tomer (2013) Data mining in the MetaCyc family of pathway databases. Methods Mol Biol 939:183-200
Caspi, Ron; Dreher, Kate; Karp, Peter D (2013) The challenge of constructing, classifying, and representing metabolic pathways. FEMS Microbiol Lett 345:85-93
Travers, Michael; Paley, Suzanne M; Shrager, Jeff et al. (2013) Groups: knowledge spreadsheets for symbolic biocomputing. Database (Oxford) 2013:bat061
Latendresse, Mario; Paley, Suzanne; Karp, Peter D (2012) Browsing metabolic and regulatory networks with BioCyc. Methods Mol Biol 804:197-216
Caspi, Ron; Altman, Tomer; Dreher, Kate et al. (2012) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 40:D742-53
Karp, Peter D; Caspi, Ron (2011) A survey of metabolic databases emphasizing the MetaCyc family. Arch Toxicol 85:1015-33
Latendresse, Mario; Karp, Peter D (2011) Web-based metabolic network visualization with a zooming user interface. BMC Bioinformatics 12:176
Caspi, Ron; Altman, Tomer; Dale, Joseph M et al. (2010) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 38:D473-9

Showing the most recent 10 out of 19 publications