Alterations to genomes either drive or mark the processes underlying the pathogenesis of cancer. One class of alteration, rearrangements, heavily impacts the structure of cancer genomes. Given the drastic effect of these mutations on cancer genomes, we expect them to be major drivers or signatures of mutational processes. Thanks to next-generation sequencing and computational and algorithmic advances, cancer research has made leaps in understanding the dynamics of tumor evolution, mutational signatures, and both genetic and epigenetic drivers of different cancer types. This information has even entered the clinical realm, where precision medicine uses it to treat cancer cases effectively. However, most of this progress has relied on identifying smaller variants such as single nucleotide variants (SNVs) or small insertions/deletions (indels), largely because, paradoxically, structural variants remain difficult to detect and characterize using current sequencing technology. With sequenced primary tumor tissues now widely available from various cancer consortia, the use of algorithms to reliably identify and characterize structural variants is not merely a preference but a necessity. Although many structural variant calling algorithms exist and are in development, most have not been benchmarked uniformly or even reliably. This poses a major problem: rearrangements clearly have major effects and should bear strong mutational signatures, but they are not precisely understood because well-developed computational tools to characterize them are lacking. No gold standard dataset of structural variants exists for any cancer type, so an accurate simulation is required to properly benchmark and develop these algorithms. This proposal will fill this gap by creating an accurate and coherent simulation of structural variants, in addition to small variants, to benchmark the most current set of structural variant tools for detection as well as for characterization of tumor evolution.

Public Health Relevance

Thoroughly characterizing large scale genomic rearrangements, which are frequent, but under-examined, in cancers and their underlying biological mechanisms hinges on the use of computational algorithms with the ever expanding resources of large genomic data available to researchers. Benchmarking these algorithms serves as a crucial first step to confidently use computational tools to detect structural variants and further describe their biological significance. Rigorous benchmarks will lead to improved algorithms that will have major implications for understanding how structural variants impact oncogenesis, cancer evolution, and even patient outcomes.

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Predoctoral Individual National Research Service Award (F31)
Project #
1F31CA232465-01
Application #
9609669
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Radaev, Sergey
Project Start
2018-07-01
Project End
2021-06-30
Budget Start
2018-07-01
Budget End
2019-06-30
Support Year
1
Fiscal Year
2018
Total Cost
Indirect Cost
Name
Weill Medical College of Cornell University
Department
Pathology
Type
Schools of Medicine
DUNS #
060217502
City
New York
State
NY
Country
United States
Zip Code
10065