Structural variants (SVs) such as deletions, insertions, inversions, duplications, and translocations in cancer genomes can promote tumor progression by perturbing gene structure and expression. Additionally, extrachromosomal DNA (ecDNA), an extreme form of SV found in a wide range of cancer types, is a reservoir of oncogene amplification and contributes to the genetic heterogeneity and evolution of tumors. Thus, a complete understanding of the structure and distribution of SVs and ecDNAs in tumors would shed light on their roles in tumor progression. However, the ability to detect and characterize SVs and ecDNAs at the molecular level has been limited by existing short-read sequencing approaches: large and complex SVs thwart efforts to detect them and correctly define their structures, and the multi-copy, heterogeneous nature of ecDNAs undermines determination of their primary structures. While ecDNAs can be observed by DAPI staining of metaphase tumor cells, determining their sequence content has typically relied on fluorescence in situ hybridization (FISH) to probe for candidate oncogenes. To support an unbiased and comprehensive molecular approach to the study of SVs, this project will develop and validate emerging genomic technologies that will enable the detection and characterization of complex SVs and ecDNAs as standard practices in cancer genomics.
In Aim 1, the read lengths of the nanopore single-molecule sequencing platform will be further extended by improving genomic DNA quality and optimizing library preparation reactions, with the goal of attaining N50 read lengths of 75-100 kb. Reads of this length are expected to span many SVs and so reveal their molecular structures and phasing information more effectively. In parallel, the recently developed SV-detecting computational pipeline, Picky, will be optimized to recognize the molecular signatures of complex SVs and ecDNAs, allowing their detection in long-read sequencing data with precision and recall both above 0.8. The active transcription of ecDNAs suggests that they are associated with RNA polymerase II transcription complexes, making them suitable for unsupervised detection by the chromatin interaction assay, ChIA-PET.
In Aim 2, ChIA-PET will be employed to map ecDNAs via their association with RNA polymerase II and to reveal transcriptionally relevant interactions between ecDNAs and the chromosomes. Computational methods will be developed to specifically detect ecDNA-amplified sequences and their associated oncogenes in ChIA-PET data. Additionally, ecDNAs uncovered by ChIA-PET will be targeted by a CRISPR/dCas9-based capture method to physically isolate ecDNA molecules for long-read sequencing and structural characterization.
Aim 3 will build on the developed methods to generate a platform for unbiased and unsupervised characterization of SVs and ecDNAs in glioblastoma neurosphere cultures and in xenograft tumor models of glioblastoma, breast, and lung cancer. Taken together, this project will develop methods and tools that will empower the cancer research community to confidently and comprehensively detect SVs and ecDNAs in cancer genomes.
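The two quantitative targets named in Aim 1, an N50 read length and precision and recall for SV detection, can be made concrete with a small sketch. It is illustrative only and is not part of Picky or of the project's methods: the function names `n50` and `precision_recall` are hypothetical, the read lengths and SV identifiers are toy values, and SV calls are compared by exact identifier, whereas real benchmarking typically matches calls to a truth set within a breakpoint tolerance.

```python
from typing import Hashable, Iterable, Set, Tuple


def n50(read_lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that reads of length >= L
    together contain at least half of all sequenced bases."""
    lengths = sorted(read_lengths, reverse=True)
    if not lengths:
        raise ValueError("no read lengths given")
    half_total = sum(lengths) / 2
    running = 0
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    return lengths[-1]  # not reached for positive lengths


def precision_recall(called: Set[Hashable],
                     truth: Set[Hashable]) -> Tuple[float, float]:
    """Precision = TP / all calls; recall = TP / all true SVs.
    Calls are matched by exact identifier (a simplifying assumption)."""
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Toy read lengths in kb (hypothetical values).
reads_kb = [100, 80, 60, 40, 20]
print("N50 (kb):", n50(reads_kb))

# Toy SV call set versus a toy truth set (hypothetical identifiers).
calls = {"sv1", "sv2", "sv3", "sv4"}
truth = {"sv1", "sv2", "sv3", "sv5", "sv6"}
print("precision, recall:", precision_recall(calls, truth))
```

Under this definition, a call set meets the stated goal only when both returned values exceed 0.8; the toy call set above does not.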

Public Health Relevance

The chromosomes of cancer cells often undergo aberrant structural changes that can promote tumor growth or interfere with the effectiveness of cancer treatment. However, identifying these changes and predicting how they will affect patient outcomes is challenging because current research technologies cannot detect these changes efficiently and comprehensively at the DNA level. This project will optimize new methods for both sequencing cancer cell DNA and analyzing the sequencing results with the specific goal of empowering the cancer research community to routinely detect--and study in much greater detail--the complicated structural changes and their tumorigenic roles in cancer cells.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Exploratory/Developmental Grants Phase II (R33)
Study Section
Special Emphasis Panel (ZCA1)
Program Officer
Sorbara, Lynn R
Jackson Laboratory
Bar Harbor
United States