Computer models of microbial metabolism have proven useful for understanding the biochemical processes by which organisms transform substrates into biomass components and energy. These models consist in part of a network representing the biochemical reactions catalyzed by the enzymes encoded in an organism's genome. The challenge of metabolic reconstruction is to create a complete reaction network suitable for systems-level analysis directly from an annotated genome. Metabolic reconstruction is still largely a manual process, and to date metabolic reconstructions have been published for only a handful of organisms. Thus, there is an urgent need for more substantial automation of this process. This research will take a breadth-first approach to generating substantially complete metabolic reconstructions for all sequenced microbial genomes. The approach is iterative: identify areas of metabolism encoded in microbial genomes that are not yet represented in the database of reaction network components, create new components for them, and generate updated metabolic reconstructions. Each of these metabolic reconstructions will require some degree of manual refinement to reach completion. Therefore, software tools to support refinement will be developed and validated by producing complete metabolic reconstructions for a small set of selected organisms. A web site will be created to make all of the metabolic reconstructions and refinement tools available to the scientific community. The database of metabolic components and the software will be fully integrated with the SEED, a widely used comparative genome annotation and analysis environment, and the principal investigators will collaborate with the SEED community to disseminate the research results. The database and software developed will significantly reduce the amount of manual effort required to create complete metabolic reconstructions that are suitable for systems-level analysis.
These resources will provide high-quality starting points for modeling efforts by organism-specific scientific communities. Metabolic reconstructions for diverse microbes will lead to new scientific investigations into their phylogenetic relationships, as well as new methods for analyzing genomic, metabolomic, and proteomic data.
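The iterative, breadth-first process described above can be illustrated with a small sketch. It is not the project's actual software: the function names, the genome and role identifiers, and the idea of representing the database as a plain mapping from annotated functional roles to reactions are all assumptions made for illustration. The sketch counts how many genomes carry each role that has no reactions in the database, ranks those gaps so the most widely shared ones are curated first, and then regenerates draft reconstructions after a gap is filled.

```python
from collections import Counter


def uncovered_roles(genome_roles, role_to_reactions):
    """Count, per role, how many genomes annotate it without database coverage.

    genome_roles: hypothetical mapping of genome id -> list of annotated roles.
    role_to_reactions: hypothetical mapping of role -> set of reaction ids.
    """
    counts = Counter()
    for roles in genome_roles.values():
        for role in set(roles):  # count each role once per genome
            if not role_to_reactions.get(role):
                counts[role] += 1
    return counts


def draft_reconstruction(roles, role_to_reactions):
    """Collect every reaction linked to the roles annotated in one genome."""
    reactions = set()
    for role in roles:
        reactions.update(role_to_reactions.get(role, ()))
    return reactions


# Toy data (entirely hypothetical identifiers).
genome_roles = {
    "genome_A": ["role_1", "role_2", "role_3"],
    "genome_B": ["role_2", "role_3"],
    "genome_C": ["role_3", "role_4"],
}
role_to_reactions = {"role_1": {"rxn_a"}, "role_2": {"rxn_b", "rxn_c"}}

# Step 1: find gaps and rank them by how many genomes they affect.
gaps = uncovered_roles(genome_roles, role_to_reactions)
ranked = sorted(gaps.items(), key=lambda item: (-item[1], item[0]))

# Step 2: a curator adds reactions for the most widely shared gap.
role_to_reactions["role_3"] = {"rxn_d"}

# Step 3: regenerate the draft reconstructions for every genome.
models = {
    genome: draft_reconstruction(roles, role_to_reactions)
    for genome, roles in genome_roles.items()
}
```

Ranking gaps by the number of affected genomes is one simple way to read "breadth-first": each curation step improves as many reconstructions as possible before any single organism is refined in depth.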
Broader Impacts: Metabolic reconstructions for diverse microbes will lead to new applications in industrial, medical and environmental contexts. This project will provide undergraduate students in biology and computer science with hands-on experience in interdisciplinary research that will prepare them for graduate work or other scientific activity. The research activities will also feed back into courses in bioinformatics, scientific computing, and microbiology. The program will draw on Hope College's existing collaborations with area community colleges and the local public school system to provide research opportunities to students from underrepresented groups in the sciences.
This project involved a collaborative effort among researchers in computer science and microbiology to produce new tools for modeling microbial metabolism. Our software (available at www.theseed.org/models and www.cs.hope.edu/cytoseed) processes DNA sequence data from a microbe to determine the biochemical pathways the organism uses to transform nutrients into biomass and energy. The end result is a metabolic model containing all the compounds, biochemical reactions, and genes involved in the organism’s metabolism. Prior to this project, creating a metabolic model for a microbe was largely a manual process that took over a year on average. Using our ModelSEED software, scientists can now produce metabolic models that are ready for analysis in approximately 48 hours. These models still need manual refinement, so we developed the CytoSEED software to visualize the biochemical pathways and aid scientists in the refinement process. To date, approximately 1,800 users have used the ModelSEED to create approximately 17,000 metabolic models. We have used the ModelSEED and CytoSEED ourselves to create, analyze, and refine metabolic models for two groups of organisms: Shewanella (26 models), a bacterial genus of interest for environmental remediation, and Mycoplasma (4 models), a bacterial genus implicated in disease. Finally, all of our research has been conducted at Hope College in Holland, Michigan, an undergraduate liberal arts college with a strong history of involving undergraduate students in research. Approximately 25 undergraduate students and 2 high school students have been involved in various aspects of the research activities. Six of these students are co-authors on scientific publications, and most have participated in a public presentation of their research.