The microbiome is now known to influence the onset, progression, and treatment of cancers both within the gastrointestinal tract and systemically. Whole-metagenome shotgun sequencing (WMS) provides the highest-resolution profiling of the human bacterial, archaeal, and viral microbiome available; however, it remains expensive to perform. Furthermore, the analysis of all microbiome data, including WMS and the less expensive 16S ribosomal RNA amplicon sequencing, is hindered by the inability to systematically compare results to previous studies and experiments. Thus, high-quality re-analysis of public microbiome data offers the opportunity for rapid and cost-effective elucidation of the roles of bacteria, viruses, and microbial function in the etiology and progression of cancer. This proposal efficiently extracts new value from published microbiome research through three aims. First, it improves the interpretability of cancer-linked microbiome profiles by translating concepts from Gene Set Enrichment Analysis and developing microbial signature resources. Second, it develops new methods to identify strain-level microbial features, fungi, human viruses, and bacteriophages from WMS data and applies these to thousands of available cancer-associated metagenomes and controls. Finally, it identifies microbiota, community structure, and functions relevant to the development or inhibition of cancer by pooled analysis and meta-analysis of publicly available human microbiome profiles, and makes these newly processed data and manually curated clinical data conveniently available to the cancer research community for further interrogation. This contribution is significant because it increases the likelihood of identifying new microbiome correlates of cancer, of correctly distinguishing causal factors from artifacts of confounding or technical batches, and of developing effective public health interventions based on the human microbiome.
The proposed research is innovative because it identifies and corrects important deficiencies in how microbiome data are processed, interpreted, and made available for re-use on a large scale by other research teams.

Public Health Relevance

The human microbiome, including infectious agents estimated to be responsible for ~18% of the global cancer burden, is implicated in the development of some cancers and in their response to treatment. This project improves the ability to identify new roles of the human microbiome in cancer by 1) enabling comprehensive comparisons of microbiome studies to previously published results and known microbial physiology, 2) developing higher-resolution approaches to identifying viruses and bacterial strains from metagenomic shotgun data, and 3) making all methods and resources easily usable by a broad research community.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project--Cooperative Agreements (U01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Lai, Gabriel Y
Graduate School of Public Health and Health Policy
Public Health & Prev Medicine
Graduate Schools
New York
United States