This project proposes an innovative, low-cost whole-genome sequencing (WGS) test for the genome-wide delineation of structural variation (SV) to capture the full spectrum of pathogenic variation that is currently detected by two lower-resolution methods. Genome-wide SV studies are the ACMG-recommended first-tier diagnostic screen for a myriad of congenital anomalies, including autism spectrum disorder. Diagnostic SV screening relies on karyotyping or chromosomal microarray (CMA). Karyotyping can detect balanced chromosomal rearrangements (BCRs) at microscopic resolution but is insensitive to smaller alterations, whereas CMA can detect copy number variants (CNVs) but not BCRs. At present, no validated genome-wide method can identify both BCRs and CMA-resolution CNVs in a single diagnostic test, leaving inevitable blind spots depending on the technology chosen. Moreover, submicroscopic, or 'cryptic', BCRs are intractable to all conventional diagnostics and represent one of the last unexplored spaces of genomic variation. We have shown that large-insert WGS, or 'jumping' libraries, can delineate both BCRs and CNVs in a research capacity. We have also recently demonstrated its clinical potential by providing an in utero prenatal diagnosis (Talkowski et al., 2012, N Engl J Med). In a paper published back-to-back with our prenatal sequencing report, Co-I Wapner and colleagues validated CMA as the preferred method for prenatal diagnostics through an NICHD consortium of 4,340 prenatal samples analyzed by both karyotyping and CMA (Wapner et al., 2012, N Engl J Med). Here, we propose to validate jumping library sequencing for routine SV detection in these well-characterized prenatal samples. In Aim 1, we will perform a critical validation to determine the sensitivity of our sequencing method to capture all pathogenic SVs detected by karyotyping and CMA, with the added benefit of precise sequence resolution for gene discovery.
In Aim 2, we will calibrate the added diagnostic value of cryptic SVs in at least 400 trios from the highest-yield diagnostic cohort (ultrasound defects) for which conventional methods failed to detect a causative mutation. Our preliminary data suggest that cryptic SVs account for 5.3-9.4% of pathogenic mutations, an important component of the diagnostic yield that is presently uncharacterized.
In Aim 3, we will integrate CMA data from a consortium of academic and commercial diagnostic sites with the exome aggregation project at the Broad Institute, which will collectively represent an amalgamation of >200,000 subjects. We will compare diagnostic yield from conventional criteria to a quantitative risk score based on the convergence of genomics datasets. The final product will be the validation of a single sequencing platform to overcome the limitations of two lower-resolution methods, the determination of clinical yield from a currently uncharacterized class of genomic variation, and the creation of a publicly accessible genome annotation resource. This project could have an immediate and transformative impact on genetic diagnostic practice.

Public Health Relevance

This project will evaluate a new technology that sequences the entire genome in a single test and can capture all of the structural changes in the chromosomes that currently require two different genetic diagnostic tests to detect. We will also define the clinical significance to prenatal diagnostics of a class of genetic variation (called 'cryptic' structural variation) that is not currently detected by any diagnostic technology, and produce an annotation of the genes implicated in human disease by combining data from more than 200,000 subjects. At its conclusion, this study could transform current approaches to genetic testing for structural variation.

National Institutes of Health (NIH)
Eunice Kennedy Shriver National Institute of Child Health & Human Development (NICHD)
Research Project (R01)
Study Section
Genetics of Health and Disease Study Section (GHD)
Program Officer
Coulombe, James N
Massachusetts General Hospital
United States
Werling, Donna M; Brand, Harrison; An, Joon-Yong et al. (2018) An analytical framework for whole-genome sequence association studies and its implications for autism spectrum disorder. Nat Genet 50:727-736
Halgren, Christina; Nielsen, Nete M; Nazaryan-Petersen, Lusine et al. (2018) Risks and Recommendations in Prenatally Detected De Novo Balanced Chromosomal Rearrangements from Assessment of Long-Term Outcomes. Am J Hum Genet 102:1090-1103
An, Joon-Yong; Lin, Kevin; Zhu, Lingxue et al. (2018) Genome-wide de novo risk score implicates promoter variation in autism spectrum disorder. Science 362:
Loh, Po-Ru; Genovese, Giulio; Handsaker, Robert E et al. (2018) Insights into clonal haematopoiesis from 8,342 mosaic chromosomal alterations. Nature 559:350-355
Cretu Stancu, Mircea; van Roosmalen, Markus J; Renkens, Ivo et al. (2017) Mapping and phasing of structural variation in patient genomes using nanopore sequencing. Nat Commun 8:1326
Collins, Ryan L; Brand, Harrison; Redin, Claire E et al. (2017) Defining the diverse spectrum of inversions, complex structural variation, and chromothripsis in the morbid human genome. Genome Biol 18:36
Shaw, Natalie D; Brand, Harrison; Kupchinsky, Zachary A et al. (2017) SMCHD1 mutations associated with a rare muscular dystrophy can also cause isolated arhinia and Bosma arhinia microphthalmia syndrome. Nat Genet 49:238-248
Redin, Claire; Brand, Harrison; Collins, Ryan L et al. (2017) The genomic landscape of balanced cytogenetic abnormalities associated with human congenital anomalies. Nat Genet 49:36-45
Ordulu, Zehra; Kammin, Tammy; Brand, Harrison et al. (2016) Structural Chromosomal Rearrangements Require Nucleotide-Level Resolution: Lessons from Next-Generation Sequencing in Prenatal Diagnosis. Am J Hum Genet 99:1015-1033