Even as new technologies continue to drive down the cost of DNA sequencing, we are in critical need of equally powerful methods that capture long-range contiguity to support both de novo genome assembly and haplotype-resolved genome resequencing. With funding through this program, we have explored diverse approaches for low-cost, massively parallel capture of contiguity information. Our progress is substantial and includes three methods: a method for in situ library construction and optical sequencing; a method that exploits 'contact probability maps' to produce the first chromosome-scale de novo mammalian genome assemblies based exclusively on short reads; and a method that combines contiguity-preserving transposition and combinatorial indexing for accurate, megabase-scale haplotype-resolved human genome resequencing. We have also demonstrated the value of contiguity information through signature projects, including the first accurate, non-invasive prediction of a fetal genome and the first haplotype-resolved sequencing of a cancer genome and epigenome. In this renewal application, we propose to narrow our focus to the advanced development of our two most promising approaches, namely contact probability mapping (Aim 1) and contiguity-preserving transposition (Aim 2). We will then formally evaluate these methods for cost, performance, and scalability, while also seeking to integrate them with one another and with emerging sequencing paradigms (Aim 3). Coupled with a modest drop in the per-base cost of short-read DNA sequencing, these methods will enable chromosome-scale de novo assembly of large genomes as well as chromosome-scale haplotype-resolved human genome resequencing for about $1,000.
As we enter an era of personalized medicine, a deep understanding of the human genome will be increasingly important to public health, both by helping to unravel the genetic basis of human disease and by playing a growing role in clinical diagnostics. The technologies developed by this project will accelerate progress towards these goals by enabling the affordable and comprehensive sequencing of individual human genomes. These same technologies will also facilitate the accurate sequencing and assembly of the genomes of other species, which inform our understanding of the human genome through comparative analysis.
Cusanovich, Darren A; Hill, Andrew J; Aghamirzaie, Delasa et al. (2018) A Single-Cell Atlas of In Vivo Mammalian Chromatin Accessibility. Cell 174:1309-1324.e18
Cusanovich, Darren A; Reddington, James P; Garfield, David A et al. (2018) The cis-regulatory dynamics of embryonic development at single-cell resolution. Nature 555:538-542
Smukowski Heil, Caiti; Burton, Joshua N; Liachko, Ivan et al. (2018) Identification of a novel interspecific hybrid yeast from a metagenomic spontaneously inoculated beer sample using Hi-C. Yeast 35:71-84
Kronenberg, Zev N; Fiddes, Ian T; Gordon, David et al. (2018) High-resolution comparative analysis of great ape genomes. Science 360:
Cao, Junyue; Packer, Jonathan S; Ramani, Vijay et al. (2017) Comprehensive single-cell transcriptional profiling of a multicellular organism. Science 357:661-667
Bickhart, Derek M; Rosen, Benjamin D; Koren, Sergey et al. (2017) Single-molecule sequencing and chromatin conformation capture enable de novo reference assembly of the domestic goat genome. Nat Genet 49:643-650
Gasperini, Molly; Findlay, Gregory M; McKenna, Aaron et al. (2017) CRISPR/Cas9-Mediated Scanning for Regulatory Elements Required for HPRT1 Expression via Thousands of Large, Programmed Genomic Deletions. Am J Hum Genet 101:192-205
Ramani, Vijay; Deng, Xinxian; Qiu, Ruolan et al. (2017) Massively multiplex single-cell Hi-C. Nat Methods 14:263-266
Gordon, David; Huddleston, John; Chaisson, Mark J P et al. (2016) Long-read sequence assembly of the gorilla genome. Science 352:aae0344
Salipante, Stephen J; Adey, Andrew; Thomas, Anju et al. (2016) Recurrent somatic loss of TNFRSF14 in classical Hodgkin lymphoma. Genes Chromosomes Cancer 55:278-87