Genome-scale sequencing, RNA and protein expression studies, and systematic functional characterization projects have provided a foundation for a new microbiology, supported by computational predictive analysis of gene, RNA and protein sequences and structures. The parallel development of microbial genome science, bioinformatics, Internet 2.0 and desktop supercomputers has helped bring this revolution to academic, industrial and governmental laboratories worldwide. However, the accurate and efficient management of this flood of new data, using expert curatorial oversight to create reliable information systems supporting experimental and systems biology research, has been an ongoing challenge. This project demonstrates that a high-impact, reliable bacterial genome database can be constructed with open source tools and maintained with low overhead in an academic research environment. Two broad, long-term objectives drive development: (1) the accurate, comprehensive and timely delivery of reliable, non-redundant, database-unified E. coli K-12 information vetted through an expert gatekeeper and suitable as a foundation for future interdisciplinary research, and (2) the support of other bacterial annotation and research projects through software and methods developed using E. coli as a model system.
The specific aims are: (1) to collect and organize all newly published E. coli research results while controlling the quality of the input data streams, to improve interface functionality, to expand the scope of the database to include all E. coli strains, and to provide the public with open source code for the database and its generic derivative; (2) to bring the E. coli K-12 MG1655 GenBank genome information up to date on a monthly basis, with the project acting as a curatorial gateway for feature, function and citation update suggestions from partner databases and the public, and to expand the datasets and documentation presented in the E. coli K-12 GenBank record; (3) to perform bioinformatics research (a) to document bioinformatics validation and annotation suites for small proteins, sRNAs, pseudogenes and 5'UTRs, and to create standardized test and training sets, and (b) to add diverse statistical algorithms into a combined pattern search tool; (4) to perform selected laboratory verification studies to (a) resolve annotation gaps and ambiguities in proteome verifications, including signal peptide/anchor discriminations, lipoproteins, and translation starts, (b) verify the genotypes/phenotypes and fill gaps in large mutant collections, and (c) precisely revert mutations acquired during laboratory maintenance to restore lost functions and phenotypes. Accurate annotation of the E. coli genome is necessary in its own right, because E. coli is the best understood cellular organism. It is also important for facilitating continued research on E. coli K-12, the reference strain most critical for developing anti-bacterial strategies to defend against bacterial bio-terrorism and to protect against the increasing threat of antibiotic-resistant bacterial epidemics.

Public Health Relevance: The project provides a web site for scientists interested in the biology of Escherichia coli K-12. Even before genomes began to be sequenced, we knew more about the life of E. coli than about any other organism. Now that we have all of the genes for E. coli, we have the possibility of attaining a nearly complete understanding of how a cell works in minute detail. The wealth of information collected at the site from all over the world about the life of E. coli, both friend and foe of man, should help in achieving this

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer: Sledjeski, Darren D
University of Miami School of Medicine
Schools of Medicine
Coral Gables
United States
Peña-Soler, Esther; Fernandez, Francisco J; López-Estepa, Miguel et al. (2014) Structural analysis and mutant growth properties reveal distinctive enzymatic and cellular roles for the three major L-alanine transaminases of Escherichia coli. PLoS One 9:e102139
Zhou, Jindan; Richardson, Andrew J; Rudd, Kenneth E (2013) EcoGene-RefSeq: EcoGene tools applied to the RefSeq prokaryotic genomes. Bioinformatics 29:1917-8
Zhou, Jindan; Rudd, Kenneth E (2013) EcoGene 3.0. Nucleic Acids Res 41:D613-24
Basturea, Georgeta N; Dague, Darryl R; Deutscher, Murray P et al. (2012) YhiQ is RsmJ, the methyltransferase responsible for methylation of G1516 in 16S rRNA of E. coli. J Mol Biol 415:16-21
Zhou, Jindan; Rudd, Kenneth E (2011) Bacterial genome reengineering. Methods Mol Biol 765:3-25
Fozo, Elizabeth M; Kawano, Mitsuoki; Fontaine, Fanette et al. (2008) Repression of small toxic protein synthesis by the Sib and OhsC small RNAs. Mol Microbiol 70:1076-93
Hemm, Matthew R; Paul, Brian J; Schneider, Thomas D et al. (2008) Small membrane proteins found by comparative genomics and ribosome binding site models. Mol Microbiol 70:1487-501
Gonnet, Pedro; Rudd, Kenneth E; Lisacek, Frederique (2004) Fine-tuning the prediction of sequences cleaved by signal peptidase II: a curated set of proven and predicted lipoproteins of Escherichia coli K-12. Proteomics 4:1597-613
Shultzaberger, R K; Bucheimer, R E; Rudd, K E et al. (2001) Anatomy of Escherichia coli ribosome binding sites. J Mol Biol 313:215-28
Sarker, S; Rudd, K E; Oliver, D (2000) Revised translation start site for secM defines an atypical signal peptide that regulates Escherichia coli secA expression. J Bacteriol 182:5592-5