Next Generation Sequencing (NGS) has fundamentally changed the way we study living systems; however, many important questions remain intractable to this powerful approach because of limitations in read length and accuracy. For example, 95% of all genes in the human transcriptome are thought to be alternatively spliced, with an average of 7 splice junctions per gene. The short reads obtained with current sequencers, however, cannot span multiple junctions and thus cannot fully characterize this variation. Accurate long reads are also a key enabling technology in the study of bacterial and viral evolution and in cataloging tumor-specific chromosomal rearrangements in cancer. To address important biological questions like these, new methods are needed that provide more accurate, longer read-length sequencing. The objective of our research is to develop a breakthrough technology that significantly enhances the accuracy and read length of NGS platforms using single-molecule barcoding implemented in droplet-based microfluidics. Each multi-kilobase-long molecule will be isolated in a droplet microreactor, amplified, fragmented, and barcoded with a sequence unique to the drop and, thus, to the molecule. Using droplet-based microfluidics, we will barcode thousands of molecules per second, the rate at which these techniques can form, split, inject, and incubate drops. This will allow us to barcode millions of molecules in minutes, far exceeding what is possible with other microfluidic systems and reaching the scale needed to use the full capacity of NGS platforms and maximally exploit the barcoding concept.
Aim 1: Develop microfluidic hardware to isolate, amplify, fragment, and barcode DNA for sequencing.
Aim 2: Develop bioinformatics software for DNA reconstruction; validate the approach.
Impact: Our technology converts the excess depth of short-read deep sequencing into highly accurate long reads. This core capability will have numerous impacts: 1) It will greatly simplify genome assembly by increasing accuracy and read length, allowing currently "inaccessible" portions of the genome to be sequenced. 2) It will allow a greater fraction of reads to be mapped to scaffolds, reducing the sequencing depth required to reach a desired coverage and thereby reducing the cost of sequencing. 3) It will allow complete interrogation of splice variation in transcriptomes at the isoform level by allowing transcripts to be sequenced in their entirety with multifold coverage, irrespective of splice structure. 4) It will allow high-confidence identification of chromosomal rearrangements in cancer by increasing sequence accuracy and read length, enabling accurate de novo assembly. 5) It will allow investigation of bacterial and viral hyperevolution in persons with persistent infection by allowing "hot spot" regions to be sequenced for each microbe individually. Thus, our work will have impacts in genomics, systems biology, cancer, and microbial evolution. Indeed, since our technology markedly increases the read length and accuracy of sequencing platforms, and since sequencing has already had a transformative impact on the biological sciences, we anticipate broad and sustained impacts in basic and clinical areas of research.
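As an illustration of the barcoding concept described above, the following minimal Python sketch groups short reads by a leading barcode so that all fragments from the same droplet (and thus the same original molecule) can be reconstructed together. It is not the project's software: the function name `group_reads_by_barcode`, the 8-base barcode length, the assumption that the barcode sits at the start of each read, and the example reads are all hypothetical, and a real pipeline would also need to handle barcode sequencing errors and perform the assembly step.

```python
from collections import defaultdict

# Hypothetical barcode length; the source does not specify one.
BARCODE_LEN = 8


def group_reads_by_barcode(reads, barcode_len=BARCODE_LEN):
    """Group reads by their leading barcode.

    Assumes (for illustration only) that each read starts with the
    droplet barcode, followed by the fragment sequence.
    Returns a dict mapping barcode -> list of fragment sequences.
    """
    groups = defaultdict(list)
    for read in reads:
        if len(read) <= barcode_len:
            # Too short to hold a barcode plus any fragment sequence.
            continue
        barcode, fragment = read[:barcode_len], read[barcode_len:]
        groups[barcode].append(fragment)
    return dict(groups)


# Made-up example reads: two droplets, plus one read that is too short.
reads = [
    "AAAACCCC" + "GATTACA",
    "GGGGTTTT" + "CCATGG",
    "AAAACCCC" + "TACAGGT",
    "ACGT",
]
groups = group_reads_by_barcode(reads)
```

Each resulting group would then be passed to an assembler to reconstruct the long molecule that was isolated in that droplet.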

Public Health Relevance

The objective of our research is to develop a breakthrough technology that significantly enhances the accuracy and read length of DNA and RNA sequencing, thereby improving our ability to understand the genetic basis of organisms. Ultimately, our goal is to use this information to enable the personalization of treatments for infectious diseases and cancer.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Exploratory/Developmental Grants (R21)
Study Section
Enabling Bioanalytical and Imaging Technologies Study Section (EBIT)
Program Officer
Schloss, Jeffery
University of California San Francisco
Schools of Pharmacy
San Francisco
United States
Sciambi, Adam; Abate, Adam R (2015) Accurate microfluidic sorting of droplets at 30 kHz. Lab Chip 15:47-51
Sciambi, Adam; Abate, Adam R (2014) Generating electric fields in PDMS microfluidic devices with salt water electrodes. Lab Chip 14:2605-9
Uricchio, Lawrence H; Hernandez, Ryan D (2014) Robust forward simulations of recurrent hitchhiking. Genetics 197:221-36
Szpiech, Zachary A; Hernandez, Ryan D (2014) selscan: an efficient multithreaded program to perform EHH-based scans for positive selection. Mol Biol Evol 31:2824-7
Eastburn, Dennis J; Sciambi, Adam; Abate, Adam R (2014) Identification and genetic analysis of cancer cells with PCR-activated cell sorting. Nucleic Acids Res 42:e128