SRI International and a group of collaborators propose to further develop the EcoCyc database (DB). EcoCyc is a freely and openly available model-organism DB for the bacterium Escherichia coli K-12 that is accessible to scientists through the World Wide Web and may be downloaded for local installation. EcoCyc is heavily used by scientists from multiple disciplines. It serves as a general reference source on E. coli for experimental biologists, and is particularly useful for the analysis of functional-genomics experiments. The DB serves computational biologists who undertake global studies of E. coli; metabolic engineers, including biofuels researchers, who develop new methods for chemical production; and bioinformaticists who use EcoCyc as a gold-standard dataset to develop new computational methods. The DB is also used by educators. We propose to continue updating EcoCyc on an ongoing basis to reflect new information about E. coli genes, metabolic pathways, and regulatory interactions. Information will be fused from the biomedical literature and from high-throughput experiments. New types of data to be integrated include data on gene essentiality, on conditions of growth and non-growth for E. coli, experimental evidence for protein existence, protein localization information, and enzyme kinetics data. We will also undertake a comprehensive and ongoing effort to computationally validate the metabolic network model within EcoCyc by comparing its predictions against many conditions of growth and non-growth for wildtype and knock-out strains of E. coli. We will develop bioinformatics methods for hypothesizing changes to the EcoCyc metabolic model that bring its predictions into closer concordance with large amounts of experimental data.
The project will also expand the Pathway Tools software used to query and analyze EcoCyc, such as by modernizing elements of its graphical interface, adding sequence-related operations, and enabling retrieval of omics datasets from the EcoliHub resource. The project will develop educational materials based on EcoCyc for graduate and undergraduate education.

Public Health Relevance

E. coli is the best-studied bacterium on earth; therefore, an electronic knowledge base that integrates experimental findings on E. coli from thousands of literature articles is a valuable resource for science and education. The comprehensive knowledge and computational tools available through EcoCyc accelerate the research of scientists who study E. coli and use E. coli as a system for developing biofuels, as well as scientists who study related organisms such as drug-resistant bacteria, and those who study bacteria that are the subjects of biodefense research.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Resource-Related Research Projects--Cooperative Agreements (U24)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Sledjeski, Darren D
SRI International
Menlo Park
United States
Caspi, Ron; Billington, Richard; Fulcher, Carol A et al. (2018) The MetaCyc database of metabolic pathways and enzymes. Nucleic Acids Res 46:D633-D639
Karp, Peter D; Billington, Richard; Caspi, Ron et al. (2017) The BioCyc collection of microbial genomes and metabolic pathways. Brief Bioinform :
Keseler, Ingrid M; Mackie, Amanda; Santos-Zavaleta, Alberto et al. (2017) The EcoCyc database: reflecting new knowledge about Escherichia coli K-12. Nucleic Acids Res 45:D543-D550
Karp, Peter D; Latendresse, Mario; Paley, Suzanne M et al. (2016) Pathway Tools version 19.0 update: software for pathway/genome informatics and systems biology. Brief Bioinform 17:877-90
Karp, Peter D (2016) Crowd-sourcing and author submission as alternatives to professional curation. Database (Oxford) 2016:
Karp, Peter D (2016) How much does curation cost? Database (Oxford) 2016:
Gama-Castro, Socorro; Salgado, Heladia; Santos-Zavaleta, Alberto et al. (2016) RegulonDB version 9.0: high-level integration of gene regulation, coexpression, motif clustering and beyond. Nucleic Acids Res 44:D133-43
Caspi, Ron; Billington, Richard; Ferrer, Luciana et al. (2016) The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases. Nucleic Acids Res 44:D471-80
Karp, Peter D (2016) Can we replace curation with information extraction software? Database (Oxford) 2016:
Weaver, Daniel S; Keseler, Ingrid M; Mackie, Amanda et al. (2014) A genome-scale metabolic flux model of Escherichia coli K-12 derived from the EcoCyc database. BMC Syst Biol 8:79

Showing the most recent 10 out of 28 publications