The advent of next-generation sequencing technologies has dramatically enhanced our ability to detect subpopulations of cells and expanded our fundamental understanding of organismal biology. However, typical sequencing protocols use bulk DNA or RNA pooled from thousands to millions of cells as input, obscuring the sequence information specific to any given cell. The only way to directly study cellular heterogeneity is to sequence individual cells. The development of single-cell sequencing (SCS) technologies has enabled systematic investigation of cellular heterogeneity in a wide range of tissues and cell populations. However, significant challenges remain. Chief among them are high cost, low throughput, reliance on customized or commercially unavailable equipment, and limited ability to accurately detect low-frequency single nucleotide variants (SNVs). As such, there is a need to "democratize" SCS by reducing or eliminating these issues. To that end, our proposal makes use of a new ligation-based approach to combinatorial cellular indexing that dramatically increases the number of individual cells that can be assayed while eliminating the need for customized equipment. Our approach, which we originally termed Split-Pool Ligation-based Transcriptomic Sequencing (SPLiT-Seq), is able to deconvolve the transcriptional profiles of >150,000 individual cells with >99.9% accuracy. It relies on combinatorial cellular indexing, in which a unique combination of short barcode sequences is ligated to all the nucleic acids in each cell, such that all reads sharing this combination can be definitively assigned to the same cell. Importantly, this approach is not inherently limited to RNA. Therefore, this proposal aims to fully develop our ligation-based split-pool cellular indexing approach for use in DNA-based applications, with a special emphasis on the detection of rare SNVs.
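The read-assignment logic behind combinatorial cellular indexing can be illustrated with a minimal sketch. It assumes each read has already been parsed into a tuple of barcodes (one per ligation round) plus its insert sequence. The function names, the three-round design, the 96 barcodes per round, and the toy sequences are all hypothetical choices made for illustration, not parameters taken from this proposal.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# A read is (barcode combination, insert sequence). The barcode
# combination holds one barcode per split-pool round (assumed layout).
Barcodes = Tuple[str, ...]
Read = Tuple[Barcodes, str]


def group_reads_by_cell(reads: Iterable[Read]) -> Dict[Barcodes, List[str]]:
    """Assign reads to cells: reads sharing a barcode combination
    are treated as derived from the same cell."""
    cells: Dict[Barcodes, List[str]] = defaultdict(list)
    for barcodes, sequence in reads:
        cells[barcodes].append(sequence)
    return dict(cells)


def barcode_space(barcodes_per_round: int, rounds: int) -> int:
    """Number of distinct barcode combinations available when every
    round draws independently from the same number of barcodes."""
    return barcodes_per_round ** rounds


# Toy input: two reads share a combination, one differs in round 1.
reads = [
    (("AAC", "GTT", "CCA"), "ACGTACGT"),
    (("AAC", "GTT", "CCA"), "TTGACCAA"),
    (("TGA", "GTT", "CCA"), "GGCATCGA"),
]
cells = group_reads_by_cell(reads)
combinations = barcode_space(96, 3)  # hypothetical 96-well, 3-round design
```

In this sketch the number of addressable cells grows exponentially with the number of rounds, which is why a small set of barcodes can index a large number of cells without specialized equipment. A real pipeline would also need to tolerate sequencing errors within the barcodes, which this example ignores.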
Specific Aim 1 will focus on strategies for in situ genome fragmentation and on optimizing the ligation and cellular indexing of genomic DNA. Low-frequency SNV detection is difficult in SCS because of the relatively high error rates of modern sequencing platforms combined with errors introduced during sample preparation. Therefore, in Specific Aim 2, we propose to integrate our ultra-accurate Duplex Sequencing technology with our combinatorial cellular indexing approach.
Cellular genetic heterogeneity has a profound impact on human health, and understanding it is imperative. Current single-cell technologies are designed to study this phenomenon at the highest resolution, but they are frequently expensive to perform, are limited in throughput, and lack the ability to detect rare single-nucleotide variants. The objective of this proposal is to "democratize" single-cell sequencing by developing a sequencing method capable of ultra-rare variant detection at the single-cell level with low cost, increased scalability, and high sensitivity.