Next-generation sequencing (NGS) is enabling the routine, systematic analysis of the somatic aberrations that accumulate in cancer genomes. Many functional mutations are structural, involving the deletion, duplication, translocation, insertion, or inversion of nucleotide sequences. Detecting these structural variants is fundamentally challenging because of the enormous number of ways a cancer genome can be altered and the widespread repeats that obstruct accurate alignment of short reads. Moreover, structural complexity is often compounded by clonal heterogeneity, i.e., a tumor specimen containing a mixture of cell populations with different aberrations, which results in diverse structural and copy number profiles. These issues pose an unprecedented challenge to developing practically useful computational tools that can identify a structural variant and elucidate its functional and clinical relevance. To fully harness the power of NGS and to facilitate advances toward personalized medicine, we propose to develop a set of novel computational tools for detecting structural variants in heterogeneous cancer genomes. Specifically, we plan to pursue the following aims: 1) develop novel computational tools for sensitive breakpoint detection and assembly; 2) develop a statistical framework to characterize structural variants in heterogeneous tumors; and 3) evaluate our tools through large-scale experimental validation and distribute them as open-source software. Our short-term goal is to accelerate the analysis of the very large volume of polyclonal NGS data produced by cancer genome sequencing projects such as The Cancer Genome Atlas and the International Cancer Genome Consortium, in order to improve our understanding of tumor evolution and identify variants of functional and clinical relevance. Our long-term goal is to develop algorithms and prototypes that are usable in clinical settings for personalized diagnosis and treatment.

Public Health Relevance

The proposed project will deliver a set of computational algorithms to measure the clonal and structural complexity of data produced by next-generation genome and transcriptome sequencing of tumor cells. Such algorithms are essential for personalized diagnosis and treatment.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Li, Jerry
University of Texas MD Anderson Cancer Center
Biostatistics & Other Math Sci
Other Domestic Higher Education
United States
Chong, Zechen; Ruan, Jue; Gao, Min et al. (2017) novoBreak: local assembly for breakpoint detection in cancer genomes. Nat Methods 14:65-67
Wang, Zixing; Kim, Tae Beom; Peng, Bo et al. (2017) Sarcomatoid Renal Cell Carcinoma Has a Distinct Molecular Pathogenesis, Driver Mutation Profile, and Transcriptional Landscape. Clin Cancer Res 23:6686-6696
Zafar, Hamim; Tzen, Anthony; Navin, Nicholas et al. (2017) SiFit: inferring tumor trees from single-cell sequencing data under finite-sites models. Genome Biol 18:178
Mashl, R Jay; Scott, Adam D; Huang, Kuan-Lin et al. (2017) GenomeVIP: a cloud platform for genomic variant discovery and interpretation. Genome Res 27:1450-1459
Fan, Xian; Chaisson, Mark; Nakhleh, Luay et al. (2017) HySA: a Hybrid Structural variant Assembly approach using next-generation and single-molecule sequencing technologies. Genome Res 27:793-800
Wyczalkowski, Matthew A; Wylie, Kristine M; Cao, Song et al. (2017) BreakPoint Surveyor: a pipeline for structural variant visualization. Bioinformatics 33:3121-3122
Chen, Tenghui; Wang, Zixing; Zhou, Wanding et al. (2016) Hotspot mutations delineating diverse mutational signatures and biological utilities across cancer types. BMC Genomics 17 Suppl 2:394
Zafar, Hamim; Wang, Yong; Nakhleh, Luay et al. (2016) Monovar: single-nucleotide variant detection in single cells. Nat Methods 13:505-7
Raghav, Kanwal; Morris, Van; Tang, Chad et al. (2016) MET amplification in metastatic colorectal cancer: an acquired response to EGFR inhibition, not a de novo phenomenon. Oncotarget 7:54627-54631
Meric-Bernstam, F; Brusco, L; Daniels, M et al. (2016) Incidental germline variants in 1000 advanced cancers on a prospective somatic genomic profiling protocol. Ann Oncol 27:795-800

Showing the most recent 10 out of 24 publications