Microbial communities are the primary drivers of the global biogeochemical cycles that maintain the nutrient balance of ecosystems and ultimately shape overall ecosystem function. Today, we appreciate the critical role microbes play in the biochemical processing of nutrient elements, yet our understanding of how the structure of microbial communities influences the suite of biogeochemical processes within a given nutrient cycle is somewhat rudimentary. This limited understanding of structure-function relationships between microbial communities and biogeochemical cycles is due in large part to technological limitations in characterizing the ecology of microbial communities. Until recently, microbial ecologists simply had no means of unambiguously characterizing the richness and evenness of species within a microbial community and the prevailing biochemical capabilities of constituent microbial populations. Using high-throughput DNA sequencing and three methodological approaches (shotgun metagenomics, PCR amplicon sequencing, and genomics), microbial ecologists are beginning to unveil the inner workings of microbial communities and connect genetic details with biogeochemical processes. While this work holds great promise for the advancement of microbial ecology, currently available high-throughput sequencing technologies are not ideally suited to the high sample-throughput demands of ecosystem science, the small genome size of bacteria and viruses, and their genetic novelty. Moreover, in some cases the failure to adequately ground-truth the application of next-generation DNA sequencers to environmental DNA samples has resulted in biased data and erroneous scientific conclusions.
This "high-risk, high-reward" research seeks to explore and test the use of a new next-generation DNA sequencer, the PacBio RS, which has several attributes that may make it better suited to the specific needs of microbial ecology research and has the potential to be highly transformative to this geoscience discipline. A series of controlled and carefully replicated experiments will be conducted to test the use of PacBio sequencing for shotgun metagenomics, 16S PCR amplicon sequencing, and single-cell genome sequencing. This project will leverage existing datasets from other high-throughput sequencing platforms (e.g., Illumina and 454) to directly compare the performance of PacBio in each of these application areas. Through an NSF Major Research Instrumentation award to the University of Delaware, the PIs will have access to one of the few PacBio RS instruments available at an academic institution. Ultimately, these investigations will constrain the experimental error within PacBio sequencing and serve as an initial demonstration of the utility of the instrument for microbial ecology research. The Broader Impacts of this proposal include an effort to understand and constrain the sources of error and other biases within PacBio sequencing, and to make technical recommendations that will shape the optimal use of the instrument within microbial science. In the course of this work, the PIs will mentor a Ph.D. graduate student and a post-doctoral researcher, and provide open access to all project data and findings.

Project Report

Direct application of high-throughput DNA sequencing (HTS) technology to the analysis of environmental DNA has provided many of the most transformative scientific discoveries in microbiology within the past five years. At the core of this renaissance has been the ability to accurately describe the species composition of microbial communities within a broad range of environments at unprecedented resolution and depth. However, environmental microbiologists have only begun to tap into the scientific promise of HTS for understanding the influence of microbial communities on larger ecosystems. Part of the limitation to reaching the full scientific potential of HTS in microbiological research has come from the sequencing technology itself. In particular, HTS technologies are limited by short DNA sequence read lengths and low sample throughput. The dominant HTS platforms in widespread use are simply not designed for the demands of environmental microbiology research. In April 2011, Pacific Biosciences Corp. released an instrument, the PacBio RS, based on a new technological approach to high-throughput DNA sequencing: Single Molecule Real Time (SMRT) sequencing. The novel engineering of the PacBio RS addresses several limitations of current high-throughput DNA sequencers, namely sample throughput, assay cost, and sequence read length.

The project investigated the application of PacBio sequencing to three common microbial ecology research problems: whole genome sequencing, community profiling by PCR amplicon sequencing, and shotgun metagenomic sequencing. In each of these investigations, PacBio sequencing improved scientific output relative to more established sequencing technologies. The team successfully showed that PacBio was superior to short-read technologies for sequencing bacterial genomes, especially genomes that had proven intractable to closure with short-read technologies. This work was published in the journal Genome Announcements (DeBruyn et al., 2014). The long DNA sequence reads resulting from PacBio were critical to uncovering gene associations shaping the biology of unknown marine viruses. In particular, the team found that the gene encoding ribonucleotide reductase could broadly predict the identity of bacteria infected by an unknown marine bacteriophage (a virus that infects bacteria) as well as the physiological conditions necessary for the virus to replicate. This work was reported in the Proceedings of the National Academy of Sciences (Sakowski et al., 2014). In carefully controlled experiments, the team was able to identify systemic bias in a commonly used sample preparation technique, bias that was shown to substantially alter scientific conclusions about the composition of a microbial community based on metagenome DNA sequence data. This work was published in the journal Microbiome (Marine et al., 2014).

The research activities of the project provided high-quality training experiences for undergraduate and graduate students. In all, three undergraduates and four graduate students were closely involved in the project and provided much of the scientific output. Nearly all of the student participants were authors on publications resulting from the project. Contributions such as these can be critical in their preparation for scientific careers.

Publications resulting from this project thus far:
DeBruyn, J.M., Radosevich, M., Wommack, K.E., Polson, S.W., Hauser, L.J., Fawaz, M.N., Korlach, J., and Tsai, Y.-C. (2014). Genome Sequence and Methylome of Soil Bacterium Gemmatirosa kalamazoonensis KBS708T, a Member of the Rarely Cultivated Gemmatimonadetes Phylum. Genome Announcements 2.
Marine, R., McCarren, C., Vorrasane, V., Nasko, D., Crowgey, E., Polson, S.W., and Wommack, K.E. (2014). Caught in the middle with multiple displacement amplification: the myth of pooling for avoiding multiple displacement amplification bias in a metagenome. Microbiome 2, 3.
Sakowski, E.G., Munsell, E.V., Hyatt, M., Kress, W., Williamson, S.J., Nasko, D.J., Polson, S.W., and Wommack, K.E. (2014). Ribonucleotide reductases reveal novel viral diversity and predict biological and ecological features of unknown marine viruses. Proc Natl Acad Sci e-pub ahead of print.

Agency: National Science Foundation (NSF)
Institute: Division of Ocean Sciences (OCE)
Type: Standard Grant (Standard)
Application #: 1148118
Program Officer: David Garrison
Project Start:
Project End:
Budget Start: 2011-09-15
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2011
Total Cost: $200,000
Indirect Cost:
Name: University of Delaware
Department:
Type:
DUNS #:
City: Newark
State: DE
Country: United States
Zip Code: 19716