Next-generation sequencing (NGS) has fundamentally changed the way we study living systems; however, many important questions remain intractable to this powerful approach because of limitations in read length and accuracy. For example, 95% of all genes in the human transcriptome are thought to be alternatively spliced, with an average of 7 splice junctions per gene. The short reads obtained with current sequencers, however, are unable to span multiple junctions and thus cannot fully characterize this variation. Accurate long reads are also a key enabling technology in the study of bacterial and viral evolution and in cataloging tumor-specific chromosomal rearrangements in cancer. For important biological questions like these to be addressed, new methods are needed that provide more accurate, longer read-length sequencing. The objective of our research is to develop a breakthrough technology to significantly enhance the accuracy and read length of NGS platforms using single-molecule barcoding implemented in droplet-based microfluidics. Each multi-kilobase-long molecule will be isolated in a droplet microreactor, amplified, fragmented, and barcoded with a sequence unique to the drop and, thus, to the molecule. Using droplet-based microfluidics, we will barcode thousands of molecules per second, the rate at which these techniques can form, split, inject, and incubate drops. This will allow us to barcode millions of molecules in minutes, far exceeding what is possible with other microfluidic systems and reaching the scale needed to use the full capacity of NGS platforms and maximally exploit the barcoding concept.
Aim 1: Develop microfluidic hardware to isolate, amplify, fragment, and barcode DNA for sequencing.
Aim 2: Develop bioinformatics software for DNA reconstruction; validate the approach.
Impact: Our technology converts the excess depth of short-read deep sequencing into highly accurate long reads.
This core capability will have numerous impacts: 1) It will greatly simplify genome assembly by increasing accuracy and read length, allowing currently "inaccessible" portions of the genome to be sequenced. 2) It will allow a greater fraction of reads to be mapped to scaffolds, reducing the depth of sequencing required to obtain a sequence of a desired coverage, thereby reducing the cost of sequencing. 3) It will allow complete interrogation of splice variation in transcriptomes at the isoform level by allowing transcripts to be sequenced in their entirety with multifold coverage, irrespective of splice structure. 4) It will allow high-confidence identification of chromosomal rearrangements in cancer by increasing sequence accuracy and read length, enabling accurate de novo assembly. 5) It will allow investigation of bacterial and viral hyper-evolution in persons with persistent infection by allowing "hot spot" regions to be sequenced for each microbe individually. Thus, our work will have impacts in genomics, systems biology, cancer, and microbial evolution. Indeed, since our technology markedly increases the read length and accuracy of sequencing platforms, and since sequencing has already had a transformative impact on the biological sciences, we anticipate broad and sustained impacts in basic and clinical areas of research.
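
The barcoding concept described above can be illustrated with a minimal Python sketch. It is not the project's software: the barcode length, the barcode-as-read-prefix layout, and the function names `group_by_barcode` and `consensus` are assumptions made for illustration, and fragment offsets are taken as already known, whereas a real pipeline would obtain them by alignment or assembly. The sketch bins short reads by barcode, so that each bin holds fragments from one original molecule, and then uses the redundant depth within a bin to take a per-position majority vote.

```python
from collections import Counter, defaultdict

BARCODE_LEN = 8  # assumed barcode length; the abstract does not specify one


def group_by_barcode(reads, barcode_len=BARCODE_LEN):
    """Bin reads by their barcode prefix (assumed layout: barcode + insert)."""
    groups = defaultdict(list)
    for read in reads:
        barcode, insert = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(insert)
    return dict(groups)


def consensus(fragments_with_offsets):
    """Majority vote per position over (offset, sequence) pairs.

    Offsets are assumed known and coverage is assumed contiguous;
    uncovered positions would simply be skipped in the output.
    """
    columns = defaultdict(Counter)
    for offset, seq in fragments_with_offsets:
        for i, base in enumerate(seq):
            columns[offset + i][base] += 1
    return "".join(columns[pos].most_common(1)[0][0] for pos in sorted(columns))


if __name__ == "__main__":
    reads = ["AAAAAAAA" + "ACGT", "AAAAAAAA" + "GTTC", "CCCCCCCC" + "TTTT"]
    print(group_by_barcode(reads))
    # The third fragment carries a simulated error at position 2 (C, not G).
    print(consensus([(0, "ACGTAC"), (2, "GTACGG"), (0, "ACCTAC")]))
```

In this toy example the error in one fragment is outvoted by the two overlapping fragments that share its barcode, which is the sense in which excess short-read depth is traded for accuracy over a longer reconstructed sequence.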

Public Health Relevance

The objective of our research is to develop a breakthrough technology that significantly enhances the accuracy and read length of DNA and RNA sequencing, improving our ability to understand the genetic basis of organisms. Ultimately, our goal is to use this information to enable the personalization of treatments for infectious diseases and cancer.

Agency
National Institutes of Health (NIH)
Institute
National Human Genome Research Institute (NHGRI)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21HG007233-02
Application #
8708928
Study Section
Enabling Bioanalytical and Imaging Technologies Study Section (EBIT)
Program Officer
Smith, Michael
Project Start
2013-08-01
Project End
2015-06-30
Budget Start
2014-07-01
Budget End
2015-06-30
Support Year
2
Fiscal Year
2014
Total Cost
Indirect Cost
Name
University of California San Francisco
Department
Pharmacology
Type
Schools of Pharmacy
DUNS #
City
San Francisco
State
CA
Country
United States
Zip Code
94143
Siltanen, Christian A; Cole, Russell H; Poust, Sean et al. (2018) An Oil-Free Picodrop Bioassay Platform for Synthetic Biology. Sci Rep 8:7913
Kim, Samuel C; Clark, Iain C; Shahi, Payam et al. (2018) Single-Cell RT-PCR in Microfluidic Droplets with Integrated Chemical Lysis. Anal Chem 90:1273-1279
Demaree, Benjamin; Weisgerber, Daniel; Lan, Freeman et al. (2018) An Ultrahigh-throughput Microfluidic Platform for Single-cell Genome Sequencing. J Vis Exp :
Lan, Freeman; Demaree, Benjamin; Ahmed, Noorsher et al. (2017) Single-cell genome sequencing at ultra-high-throughput with microfluidic droplet barcoding. Nat Biotechnol 35:640-646
Sukovich, David J; Lance, Shea T; Abate, Adam R (2017) Sequence specific sorting of DNA molecules with FACS using 3dPCR. Sci Rep 7:39385
Shahi, Payam; Kim, Samuel C; Haliburton, John R et al. (2017) Abseq: Ultrahigh-throughput single cell protein profiling with droplet microfluidic barcoding. Sci Rep 7:44447
Lim, Shaun W; Lance, Shea T; Stedman, Kenneth M et al. (2017) PCR-activated cell sorting as a general, cultivation-free method for high-throughput identification and enrichment of virus hosts. J Virol Methods 242:14-21
Kim, Samuel C; Premasekharan, Gayatri; Clark, Iain C et al. (2017) Measurement of copy number variation in single cancer cells using rapid-emulsification digital droplet MDA. Microsyst Nanoeng 3:
Johnston, Henry Richard; Hu, Yi-Juan; Gao, Jingjing et al. (2017) Identifying tagging SNPs for African specific genetic variation from the African Diaspora Genome. Sci Rep 7:46398
Mathias, Rasika Ann; Taub, Margaret A; Gignoux, Christopher R et al. (2016) A continuum of admixture in the Western Hemisphere revealed by the African Diaspora genome. Nat Commun 7:12522

Showing the most recent 10 out of 27 publications