High-Performance Validation and Classification of Metagenomic Ribosomal-RNA Sequences

Innovations in culture-independent studies of environmental DNA sequences (i.e., metagenomics), coupled with rapidly advancing DNA sequencing capabilities, have profoundly altered the volume of sequence data that can be processed in a study. However, several bottlenecks to metagenomic data analysis must be overcome as production is scaled up and findings are generalized. These include detection and culling of human and chimeric sequences; removal or correction of sequencing errors; accurate assessment of biodiversity; accurate taxonomic classification of sequences; and analysis of microbial eukaryotes in metagenomic specimens. Our overall objective is to build a framework for evaluating and ensuring the quality of primary sequence data and associated phylogenetic metadata. Because rRNA-based phylogenetic analysis remains an essential means of organizing and interpreting the analyses of other metagenomic sequences, we focus in this proposed project on quality assurance issues related to rRNA sequence data. Specifically, we propose to build a software infrastructure based on a high-precision alignment tool (INFERNAL) that addresses many of the critical barriers to progress facing metagenomic research programs. Rigorous rRNA sequence alignment is a strict requirement for accurate sequence-based phylogenetic classification of microorganisms in metagenomic samples. The open-source INFERNAL alignment software developed by Prof. Sean Eddy (Co-Investigator) and colleagues permits a level of analysis that extends far beyond other widely used automated sequence aligners. This base technology, developed to identify and annotate RNA genes in genomes in conjunction with the Rfam database, offers an opportunity to develop and incorporate features that could significantly reduce current barriers to metagenomic analysis.
INFERNAL uses consensus RNA primary and secondary structure (a covariance model, or CM) to guide alignment. Calculation of position-specific measures of alignment uncertainty allows detection of poorly aligned sequences and alignment positions, which can be removed prior to downstream applications such as phylogenetic inference. INFERNAL-based CM alignment can therefore be used as a sensitive mechanism for detecting and eliminating anomalous sequences (e.g., chimeras, non-rRNA sequences) and sequencing errors from datasets. In this two-year project, we propose a leveraged scheme in which the INFERNAL technology is adapted to the needs of the metagenomics community through joint development by the Pace and Eddy groups. Under this proposal, the Eddy lab (fully funded by HHMI) will continue to develop the core technology and functionality enhancements of INFERNAL, while the Pace lab (as funded by this grant) will use its extensive background in rRNA phylogenetic analyses to build and validate software tools that extend the basic feature set of INFERNAL, with special emphasis on facilitating research carried out in the Human Microbiome Project.
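The filtering idea described above, in which poorly aligned sequences and alignment positions are removed before downstream analysis, can be illustrated with a minimal sketch. This is not INFERNAL code and does not reproduce its output format: the function name `mask_low_confidence`, the two thresholds, and the per-position confidence scores (values between 0 and 1, with gaps scored 0.0) are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def mask_low_confidence(
    aligned: Dict[str, str],
    confidence: Dict[str, List[float]],
    min_sequence_conf: float = 0.8,  # hypothetical threshold
    min_column_conf: float = 0.9,    # hypothetical threshold
) -> Tuple[Dict[str, str], List[str]]:
    """Reject poorly aligned sequences, then drop poorly aligned columns.

    `aligned` maps sequence names to gapped rows of equal length;
    `confidence` holds one score per alignment column for each row.
    """
    kept: Dict[str, str] = {}
    rejected: List[str] = []

    # Step 1: reject sequences whose mean confidence over residues is low.
    for name, row in aligned.items():
        scores = [c for base, c in zip(row, confidence[name]) if base != "-"]
        mean = sum(scores) / len(scores) if scores else 0.0
        if mean >= min_sequence_conf:
            kept[name] = row
        else:
            rejected.append(name)

    if not kept:
        return {}, rejected

    # Step 2: keep only columns whose mean confidence, taken over the
    # residues of the surviving sequences, meets the column threshold.
    width = len(next(iter(kept.values())))
    keep_cols: List[int] = []
    for col in range(width):
        col_scores = [
            confidence[name][col]
            for name, row in kept.items()
            if row[col] != "-"
        ]
        if col_scores and sum(col_scores) / len(col_scores) >= min_column_conf:
            keep_cols.append(col)

    filtered = {
        name: "".join(row[c] for c in keep_cols) for name, row in kept.items()
    }
    return filtered, rejected


# Toy input: three aligned rows with made-up confidence scores.
aligned = {"seqA": "ACGU-A", "seqB": "ACGUGA", "odd": "AC-UGA"}
confidence = {
    "seqA": [0.99, 0.98, 0.97, 0.95, 0.0, 0.96],
    "seqB": [0.99, 0.97, 0.96, 0.94, 0.40, 0.95],
    "odd": [0.50, 0.40, 0.0, 0.30, 0.20, 0.35],
}
filtered, rejected = mask_low_confidence(aligned, confidence)
```

In this toy input the row named `odd` is rejected for its low mean confidence, and the fifth column is dropped because its only surviving residue is poorly supported. A real pipeline would read the scores from the aligner's output rather than from hand-written lists.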

Public Health Relevance

Innovations in culture-independent microbiology (i.e., metagenomics) now permit detailed analyses of complex microbial populations, such as those that contribute to the health and well-being of humans. Rapidly advancing DNA sequencing capabilities have profoundly altered the volume of sequence data that can be processed in a study. However, several bottlenecks to the analysis of these DNA sequence data must be overcome as the scale of studies expands. These include several issues concerning quality assurance of primary DNA sequence data, as well as interpretation of results drawn from these data, for instance the accuracy of identifying microorganisms in a specimen based solely on DNA sequence. In this project, we propose to build a software infrastructure based on a high-precision DNA sequence analysis tool (INFERNAL) that addresses many of the critical barriers to progress currently facing researchers in the metagenomics field. In this two-year project, the base software technology, developed by Prof. Eddy (Co-Investigator) and colleagues to identify and annotate RNA genes in genomes, will be adapted to the needs of the metagenomics community through joint development by the Pace and Eddy groups. This research team will use its extensive background in RNA structural biology, molecular evolution, and computational biology to build and validate software tools that extend the basic feature set of INFERNAL, with special emphasis on facilitating research carried out in the NIH Human Microbiome Project.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21HG005964-01
Application #
8021062
Study Section
Special Emphasis Panel (ZRG1-GGG-N (50))
Program Officer
Proctor, Lita
Project Start
2010-09-27
Project End
2011-06-30
Budget Start
2010-09-27
Budget End
2011-06-30
Support Year
1
Fiscal Year
2010
Total Cost
$227,250
Indirect Cost
Name
University of Colorado at Boulder
Department
Biochemistry
Type
Schools of Arts and Sciences
DUNS #
007431505
City
Boulder
State
CO
Country
United States
Zip Code
80309
Alkanani, Aimon K; Hara, Naoko; Gottlieb, Peter A et al. (2015) Alterations in Intestinal Microbiota Correlate With Susceptibility to Type 1 Diabetes. Diabetes 64:3510-20
Hauser, Leah J; Feazel, Leah M; Ir, Diana et al. (2015) Sinus culture poorly predicts resident microbiota. Int Forum Allergy Rhinol 5:3-9
Ramakrishnan, Vijay R; Hauser, Leah J; Feazel, Leah M et al. (2015) Sinus microbiota varies among chronic rhinosinusitis phenotypes and predicts surgical outcome. J Allergy Clin Immunol 136:334-42.e1
Krebs, Nancy F; Sherlock, Laurie G; Westcott, Jamie et al. (2013) Effects of different complementary feeding regimens on iron status and enteric microbiota in breastfed infants. J Pediatr 163:416-23
Robertson, Charles E; Harris, J Kirk; Wagner, Brandie D et al. (2013) Explicet: graphical user interface software for metadata-driven management, analysis and visualization of microbiome data. Bioinformatics 29:3100-1
Ramakrishnan, Vijay R; Feazel, Leah M; Gitomer, Sarah A et al. (2013) The microbiome of the middle meatus in healthy adults. PLoS One 8:e85507
Markle, Janet G M; Frank, Daniel N; Mortin-Toth, Steven et al. (2013) Sex differences in the gut microbiome drive hormone-dependent regulation of autoimmunity. Science 339:1084-8
Tong, Maomeng; Li, Xiaoxiao; Wegener Parfrey, Laura et al. (2013) A modular organization of the human intestinal mucosal microbiota and its association with inflammatory bowel disease. PLoS One 8:e80702
Wu, Xiao; Berkow, Kathryn; Frank, Daniel N et al. (2013) Comparative analysis of microbiome measurement platforms using latent variable structural equation modeling. BMC Bioinformatics 14:79
Robertson, Charles E; Baumgartner, Laura K; Harris, J Kirk et al. (2013) Culture-independent analysis of aerosol microbiology in a metropolitan subway system. Appl Environ Microbiol 79:3485-93

Showing the most recent 10 out of 21 publications