SRI International and a group of collaborators propose to further develop the EcoCyc database (DB). EcoCyc is a freely and openly available model-organism DB for the bacterium Escherichia coli K-12 that is accessible to scientists through the World Wide Web and may be downloaded for local installation. EcoCyc is heavily used by scientists from multiple disciplines. It serves as a general reference source on E. coli for experimental biologists, and is particularly useful for the analysis of functional-genomics experiments. The DB serves computational biologists who undertake global studies of E. coli, metabolic engineers (including biofuels researchers) who develop new methods for chemical production, and bioinformaticists who use EcoCyc as a gold-standard dataset to develop new computational methods. The DB is also used by educators. We propose to continue updating EcoCyc on an ongoing basis to reflect new information about E. coli genes, metabolic pathways, and regulatory interactions. Information will be fused from the biomedical literature and from high-throughput experiments. New types of data to be integrated include data on gene essentiality, conditions of growth and non-growth for E. coli, experimental evidence for protein existence, protein localization information, and enzyme kinetics data. We will also undertake a comprehensive and ongoing effort to computationally validate the metabolic network model within EcoCyc by testing its predictions against many conditions of growth and non-growth for wildtype and knock-out strains of E. coli. We will develop bioinformatics methods for hypothesizing changes to the EcoCyc metabolic model that bring its predictions into closer concordance with large amounts of experimental data.
The project will also expand the Pathway Tools software used to query and analyze EcoCyc, such as by modernizing elements of its graphical interface, adding sequence-related operations, and enabling retrieval of omics datasets from the EcoliHub resource. The project will develop educational materials based on EcoCyc for graduate and undergraduate education.

Public Health Relevance

E. coli is the best-studied bacterium on earth; therefore, an electronic knowledge base that integrates experimental findings on E. coli from thousands of literature articles is a valuable resource for science and education. The comprehensive knowledge and computational tools available through EcoCyc accelerate the research of scientists who study E. coli or use it as a system for developing biofuels, as well as scientists who study related organisms such as drug-resistant bacteria and bacteria that are the subjects of biodefense research.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Resource-Related Research Projects--Cooperative Agreements (U24)
Project #
5U24GM077678-22
Application #
8322072
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Sledjeski, Darren D
Project Start
1992-08-15
Project End
2014-06-30
Budget Start
2012-07-01
Budget End
2013-06-30
Support Year
22
Fiscal Year
2012
Total Cost
$1,177,912
Indirect Cost
$382,119
Name
SRI International
Department
Type
DUNS #
009232752
City
Menlo Park
State
CA
Country
United States
Zip Code
94025
Keseler, Ingrid M; Skrzypek, Marek; Weerasinghe, Deepika et al. (2014) Curation accuracy of model organism databases. Database (Oxford) 2014:
Mackie, Amanda; Paley, Suzanne; Keseler, Ingrid M et al. (2014) Addition of Escherichia coli K-12 growth observation and gene essentiality data to the EcoCyc database. J Bacteriol 196:982-8
Weaver, Daniel S; Keseler, Ingrid M; Mackie, Amanda et al. (2014) A genome-scale metabolic flux model of Escherichia coli K-12 derived from the EcoCyc database. BMC Syst Biol 8:79
Karp, Peter D; Weaver, Daniel; Paley, Suzanne et al. (2014) The EcoCyc Database. EcoSal Plus 2014:
Keseler, Ingrid M; Mackie, Amanda; Peralta-Gil, Martin et al. (2013) EcoCyc: fusing model organism databases with systems biology. Nucleic Acids Res 41:D605-12
Eker, Steven; Krummenacker, Markus; Shearer, Alexander G et al. (2013) Computing minimal nutrient sets from metabolic networks via linear constraint solving. BMC Bioinformatics 14:114
Travers, Michael; Paley, Suzanne M; Shrager, Jeff et al. (2013) Groups: knowledge spreadsheets for symbolic biocomputing. Database (Oxford) 2013:bat061
Latendresse, Mario; Krummenacker, Markus; Trupp, Miles et al. (2012) Construction and completion of flux balance models from pathway databases. Bioinformatics 28:388-96
Keseler, Ingrid M; Collado-Vides, Julio; Santos-Zavaleta, Alberto et al. (2011) EcoCyc: a comprehensive database of Escherichia coli biology. Nucleic Acids Res 39:D583-90
Keseler, Ingrid M; Bonavides-Martinez, Cesar; Collado-Vides, Julio et al. (2009) EcoCyc: a comprehensive view of Escherichia coli biology. Nucleic Acids Res 37:D464-70

Showing the most recent 10 out of 11 publications