An avalanche of sequencing and genomic data has the potential to revolutionize our understanding of microbial biology and transform medical research. Accurate reconstructions of bacterial metabolism often provide the most direct route toward understanding the biology of sequenced species, their growth and environmental properties, and their possible interactions with other species in metagenomic communities. Reconstructed metabolic networks can also guide the development of new antibacterial therapeutics. In this application we propose to use the GLOBUS framework to build an integrated, accurate, and fully probabilistic system for the annotation of bacterial metabolism; we will integrate into the framework key data modalities that, according to our preliminary results, will significantly improve the method's coverage and accuracy. Specifically, we will (a) integrate protein structural information into GLOBUS, using analyses of enzyme active sites; (b) extend the GLOBUS framework to accommodate metabolomics data by joint sampling of metabolomics and protein annotations; and (c) integrate into the annotation framework phenotypic information, including growth on multiple nutrient sources measured by the widely used BiOLOG platform for high-throughput bacterial phenotyping. We will implement the developed methodology and make it publicly available through a web-based portal, so that other researchers can use GLOBUS to annotate any sequenced bacterial species of interest. The portal will be transparent, enabling users to identify the sources of the predictions. We will also obtain relevant phenotypic information from our experimental collaborators and use GLOBUS to generate accurate probabilistic annotations for all major bacterial species (~50 bacteria) that are pathogenic to humans.

Public Health Relevance

The GLOBUS methodology developed in the previous funding period represents a major conceptual innovation over existing approaches to metabolic annotation. Building on that work, we propose in this renewal application to significantly expand the methodology, add key data modalities (protein structure, metabolomics, phenotypes), and apply the method to all major bacterial species that are pathogenic to humans. Comprehensive phenotypic data (~2000 growth conditions per species for ~50 species) will be obtained from our experimental collaborators and used to significantly improve metabolic annotations for all major bacterial human pathogens.

National Institutes of Health (NIH)
Research Project (R01)
Study Section
Modeling and Analysis of Biological Systems Study Section (MABS)
Program Officer
Sledjeski, Darren D
Columbia University (N.Y.)
Internal Medicine/Medicine
Schools of Medicine
New York
United States
Plata, Germán; Vitkup, Dennis (2014) Genetic robustness and functional evolution of gene duplicates. Nucleic Acids Res 42:2405-14
Hu, Jie; Locasale, Jason W; Bielas, Jason H et al. (2013) Heterogeneity of tumor-induced gene expression changes in the human metabolic network. Nat Biotechnol 31:522-9
Plata, Germán; Gottesman, Max E; Vitkup, Dennis (2010) The rate of the molecular clock and the cost of gratuitous protein synthesis. Genome Biol 11:R98
Hsiao, Tzu-Lin; Revelles, Olga; Chen, Lifeng et al. (2010) Automatic policing of biochemical annotations using genomic correlations. Nat Chem Biol 6:34-40
Chastanet, Arnaud; Vitkup, Dennis; Yuan, Guo-Cheng et al. (2010) Broadly heterogeneous activation of the master regulator for sporulation in Bacillus subtilis. Proc Natl Acad Sci U S A 107:8486-91
de Hoon, Michiel J L; Eichenberger, Patrick; Vitkup, Dennis (2010) Hierarchical evolution of the bacterial sporulation network. Curr Biol 20:R735-45
Feldman, Igor; Rzhetsky, Andrey; Vitkup, Dennis (2008) Network properties of genes harboring inherited disease mutations. Proc Natl Acad Sci U S A 105:4323-8
Fuhrer, Tobias; Chen, Lifeng; Sauer, Uwe et al. (2007) Computational prediction and experimental verification of the gene encoding the NAD+/NADP+-dependent succinate semialdehyde dehydrogenase in Escherichia coli. J Bacteriol 189:8073-8