This project proposes an innovative, low-cost whole-genome sequencing (WGS) test for the genome-wide delineation of structural variation (SV), designed to capture the full spectrum of pathogenic variation that is currently detected by two lower-resolution methods. Genome-wide SV studies are the ACMG-recommended first-tier diagnostic screen for a myriad of congenital anomalies, including autism spectrum disorder. Diagnostic SV screening currently relies on karyotyping or chromosomal microarray (CMA). Karyotyping can detect balanced chromosomal rearrangements (BCRs) at microscopic resolution but is insensitive to smaller alterations, whereas CMA can detect copy number variants (CNVs) but not BCRs. At present, no validated genome-wide method can identify both BCRs and CMA-resolution CNVs in a single diagnostic test, which leaves inevitable blind spots that depend on the technology chosen. Moreover, submicroscopic, or 'cryptic', BCRs are intractable to all conventional diagnostics and represent one of the last unexplored spaces of genomic variation. We have shown that large-insert WGS, or 'jumping' libraries, can delineate both BCRs and CNVs in a research capacity. We have also recently demonstrated the clinical potential of this approach by providing an in utero prenatal diagnosis (Talkowski et al., 2012, N Engl J Med). In a paper published back-to-back with our prenatal sequencing study, Co-I Wapner and colleagues validated CMA as the preferred method for prenatal diagnostics through an NICHD consortium of 4,340 prenatal samples analyzed by both karyotyping and CMA (Wapner et al., 2012, N Engl J Med). Here, we propose to validate jumping library sequencing for routine SV detection in these well-characterized prenatal samples. In Aim 1, we will perform a critical validation to determine the sensitivity of our sequencing method for capturing all pathogenic SVs detected by karyotyping and CMA, with the added benefit of precise sequence resolution for gene discovery.
In Aim 2, we will quantify the added diagnostic value of cryptic SVs in at least 400 trios from the highest-yield diagnostic cohort (ultrasound defects) for which conventional methods failed to detect a causative mutation. Our preliminary data suggest that cryptic SVs account for 5.3-9.4% of pathogenic mutations, an important component of the diagnostic yield that is presently uncharacterized.
In Aim 3, we will integrate CMA data from a consortium of academic and commercial diagnostic sites with the exome aggregation project at the Broad Institute, which together will represent >200,000 subjects. We will compare the diagnostic yield of conventional criteria with that of a quantitative risk score based on the convergence of genomics datasets. The final product will be the validation of a single sequencing platform that overcomes the limitations of two lower-resolution methods, the determination of clinical yield from a currently uncharacterized class of genomic variation, and the creation of a publicly accessible genome annotation resource. This project could have an immediate and transformative impact on genetic diagnostic practice.
This project will evaluate a new technology that sequences the entire genome in a single test and can capture all of the structural changes in the chromosomes that currently require two different genetic diagnostic tests to detect. We will also define the clinical significance for prenatal diagnostics of a class of genetic variation (called 'cryptic' structural variation) that is not currently detected by any diagnostic technology, and we will produce an annotation of the genes implicated in human disease by combining data from more than 200,000 subjects. At its conclusion, this study could transform current approaches to genetic testing for structural variation.