While metagenomics can reveal the genetic composition (and therefore the genetic potential) of microbial communities, other meta-omic techniques (e.g., metatranscriptomics and metaproteomics) can provide additional insights into the functional characteristics of these communities, such as gene activities and their regulatory mechanisms. Analyzing these functional microbiome data sets raises new computational challenges. In this application, we propose novel approaches to metatranscriptomic and metaproteomic data analysis that use de Bruijn graph representations of metagenome assemblies as the reference, enabling an integrated analysis of meta-omic data sets. The advantages of using de Bruijn graphs are that: 1) they provide a compact representation of metagenomes (which are likely to be redundant) and allow direct computation on the graph; 2) they naturally capture genomic variations; and 3) they capture the "ambiguous" connectivity between contigs/scaffolds, which can be resolved in subsequent steps using additional information, or utilized to achieve better identification and quantification of genes and proteins using metatranscriptomic and metaproteomic data, respectively. We will apply our new tools to the analysis of functional human microbiome data sets, including those to be generated by HMP phase II projects.
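The first two advantages listed above can be illustrated with a toy example. The sketch below is not the proposed software; it is a minimal illustration under simplifying assumptions (no reverse complements, no sequencing errors, no coverage information), and the function names `build_de_bruijn` and `branching_nodes` and the two short "strain" sequences are hypothetical. It shows that duplicated sequences add nothing to the graph and that a single-base variant appears as a branch at the node preceding the difference.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Build a de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Returns a dict mapping each (k-1)-mer to the set of (k-1)-mers that
    follow it in at least one input sequence.
    """
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            # The k-mer links its prefix node to its suffix node.
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def branching_nodes(graph):
    """Return nodes with more than one successor (where paths diverge)."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Hypothetical sequences from two strains that differ by one base (G vs. T).
strain_a = "ATGGCGTGCA"
strain_b = "ATGGCTTGCA"

# strain_a is given twice to mimic redundancy; the duplicate adds no edges.
graph = build_de_bruijn([strain_a, strain_a, strain_b], k=4)

num_edges = sum(len(succ) for succ in graph.values())
print("distinct edges:", num_edges)
print("branching nodes:", branching_nodes(graph))
print("successors of GGC:", sorted(graph["GGC"]))
```

Here the shared prefix and suffix of the two sequences are stored once, and the variant shows up as two alternative paths leaving the node `GGC`. A practical tool would additionally need to handle both strands, sequencing errors, and k-mer abundance.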
We propose to develop graph-centric approaches to metatranscriptomic and metaproteomic data analysis. These approaches will be a timely addition to the computational tools that are central to the interpretation and integration of metagenomic and other functional microbiome data, leading to a better understanding of the functionality and dynamics of microbial communities and of their responses to environmental changes, such as the health conditions of their human hosts.
Jiao, Dazhi; Han, Wontack; Ye, Yuzhen (2017) Functional association prediction by community profiling. Methods 129:8-17
Zhang, Quan; Ye, Yuzhen (2017) Not all predicted CRISPR-Cas systems are equal: isolated cas genes and classes of CRISPR like elements. BMC Bioinformatics 18:92
Han, Wontack; Wang, Mingjie; Ye, Yuzhen (2017) A concurrent subtractive assembly approach for identification of disease associated sub-metagenomes. Res Comput Mol Biol 2017:18-33
Ye, Yuzhen; Zhang, Quan (2016) Characterization of CRISPR RNA transcription by exploiting stranded metatranscriptomic data. RNA 22:945-56
Li, Sujun; Tang, Haixu (2016) Computational Methods in Mass Spectrometry-Based Proteomics. Adv Exp Med Biol 939:63-89
Tang, Haixu; Li, Sujun; Ye, Yuzhen (2016) A Graph-Centric Approach for Metagenome-Guided Peptide and Protein Identification in Metaproteomics. PLoS Comput Biol 12:e1005224
Ye, Yuzhen; Tang, Haixu (2016) Utilizing de Bruijn graph of metagenome assembly for metatranscriptome analysis. Bioinformatics 32:1001-8
Wang, Mingjie; Doak, Thomas G; Ye, Yuzhen (2015) Subtractive assembly for comparative metagenomics, and its application to type 2 diabetes metagenomes. Genome Biol 16:243
Bao, Guanhui; Wang, Mingjie; Doak, Thomas G et al. (2015) Strand-specific community RNA-seq reveals prevalent and dynamic antisense transcription in human gut microbiota. Front Microbiol 6:896