While metagenomics can reveal the genetic composition (and therefore the genetic potential) of microbial communities, other meta-omic techniques (e.g., metatranscriptomics and metaproteomics) can provide additional insights into the functional characteristics of these communities, such as gene activities and their regulatory mechanisms. Analyzing these functional microbiome data sets raises new computational challenges. In this application, we propose novel approaches to metatranscriptomic and metaproteomic data analysis that use de Bruijn graph representations of metagenome assemblies as the reference, enabling an integrated analysis of meta-omic data sets. The advantages of using de Bruijn graphs include: 1) they provide a compact representation of metagenomes (which are likely to be redundant) and allow direct computation on the graph; 2) they naturally capture genomic variations; and 3) they capture the ambiguous connectivity between contigs/scaffolds, which can be resolved in subsequent steps using additional information, or utilized to achieve better identification and quantification of genes and proteins using metatranscriptomic and metaproteomic data, respectively. We will apply our new tools to analyze functional human microbiome data sets, including those to be generated from HMP phase II projects.
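The first two advantages listed above can be illustrated with a toy example. The following is a minimal sketch, not the proposed method: the function names (`build_de_bruijn`, `count_edges`, `branching_nodes`), the two short "strain" sequences, and the choice of k = 4 are all hypothetical and chosen only for illustration. Real metagenome assemblers additionally handle reverse complements, sequencing errors, and k-mer abundance, none of which is modeled here.

```python
from collections import defaultdict


def build_de_bruijn(sequences, k):
    """Build a toy de Bruijn graph: nodes are (k-1)-mers, edges are k-mers.

    Returns a dict mapping each (k-1)-mer to the set of (k-1)-mers that
    follow it in at least one input sequence.
    """
    graph = defaultdict(set)
    for seq in sequences:
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])  # prefix -> suffix
    return dict(graph)


def count_edges(graph):
    """Number of distinct k-mers (edges) stored in the graph."""
    return sum(len(successors) for successors in graph.values())


def branching_nodes(graph):
    """Nodes with more than one successor, i.e. where paths diverge."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Two hypothetical strain sequences that differ by a single substitution.
strain_a = "GATTACAGGCTC"
strain_b = "GATTAGAGGCTC"
k = 4  # assumed k-mer size for this toy example

separate = count_edges(build_de_bruijn([strain_a], k)) + \
    count_edges(build_de_bruijn([strain_b], k))
joint_graph = build_de_bruijn([strain_a, strain_b], k)

print(separate)                      # 18 edges if each strain is stored alone
print(count_edges(joint_graph))      # 13 edges in the shared graph
print(branching_nodes(joint_graph))  # ['TTA'], where the two variants diverge
```

In this sketch the shared k-mers are stored once, so the joint graph is smaller than the two strains stored separately, and the single-base difference appears as two alternative paths that leave node `TTA` and rejoin downstream.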
We propose to develop graph-centric approaches to metatranscriptomic and metaproteomic data analysis. These approaches will be a timely addition to the computational tools that are central to interpreting and integrating metagenomic and other functional microbiome data, leading to a better understanding of the functionality and dynamics of microbial communities and of their responses to environmental changes, such as changes in the health conditions of their human hosts.