Sequencing-based microbiome research is becoming increasingly important for clinical studies. Human commensal microbiomes have been shown to have a wide variety of potential health impacts. However, our ability to genetically assay microbes is still limited. Microbiomes are extremely complex, and standard short-read sequencing technologies often do not provide a sufficient basis for recovering the relevant genes and organisms. Low-input, low-cost linked-read DNA sequencing technologies, such as the 10x Genomics Chromium system, have recently emerged with unprecedented promise for de novo assembly of whole-genome or metagenome samples. These technologies employ a novel molecular barcoding technique that adds long-range information to standard high-throughput, short-read next-generation sequencing while keeping reagent costs reasonable. We plan to develop several novel algorithms that fully and efficiently leverage barcoded reads to improve several integral and challenging applications, in particular: improving metagenome assembly, and leveraging the increased sensitivity to low-abundance genomic information to identify clinically relevant and potentially pathogenic organisms that can inform clinical decisions. All of our proposed methods and computational tools will be made freely available, with extensive documentation, for the community to use. To ensure the utility of our methods, we plan to apply them extensively to a wide range of research and clinical shotgun metagenome data sets, both in our laboratory and through established local, external, and industrial collaborations. We also plan to collect control samples and sequence them on multiple platforms (Illumina, 10x Genomics, Loop Genomics Read Cloud, UST TELL-Seq, Oxford Nanopore) for benchmarking. We will also use our proposed methods to improve the detection and classification of low-abundance organisms in clinical samples.
We will launch two pilot projects in collaboration with our Department of Pathology and the Hospital for Special Surgery (HSS). Successful completion of this project will provide fast and scalable computational methods that can be applied to large-scale data sets.

Public Health Relevance

This work comprises novel computational developments, improvements, and applications that leverage linked-read sequencing technologies, which can characterize metagenome DNA sequences more completely. We also plan to build robust benchmarks, show the applicability of our algorithms to complex and low-abundance samples, and pioneer a new approach to Precision Metagenomics.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Unknown (R35)
Project #
1R35GM138152-01
Application #
10029180
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Brazhnik, Paul
Project Start
2020-09-01
Project End
2025-06-30
Budget Start
2020-09-01
Budget End
2021-06-30
Support Year
1
Fiscal Year
2020
Total Cost
Indirect Cost
Name
Weill Medical College of Cornell University
Department
Physiology
Type
Schools of Medicine
DUNS #
060217502
City
New York
State
NY
Country
United States
Zip Code
10065