Large-scale cancer genomics projects such as The Cancer Genome Atlas (TCGA), the International Cancer Genome Consortium (ICGC), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Pediatric Cancer Genome Project (PCGP) are producing a wealth of high-throughput sequence data from a large number of cancer samples and their matched normals. These data hold great promise for understanding the genetic basis of cancer and for identifying germline susceptibility variants in cancer. Major advances have been made in systematically cataloging somatic variations in cancer genomes from these data sets. However, identifying and interpreting germline changes using data from these studies remains a significant challenge. The primary difficulty stems from (1) the lack of computational pipelines and tools that use tumor and normal sequencing data to detect somatic, germline, and loss of heterozygosity (LOH) events simultaneously at the nucleotide and chromosomal levels, and (2) the lack of uniform bioinformatics analysis strategies for identifying and prioritizing deleterious candidate germline variants responsible for susceptibility. We will develop a computational pipeline for the identification and interpretation of germline alterations in cancer, including single nucleotide variants, insertions and deletions (indels), copy number variations, and structural variants. This pipeline will initially be used to systematically analyze whole genome, exome, and RNA-sequencing data from over 5,000 cancer cases already generated by several major efforts and individual research groups, along with additional cases that will be made publicly available in the next several years. Germline variants from these data that are predicted in silico to be deleterious will be used for statistical association analysis across groups stratified by age and cancer type to identify novel germline susceptibility variants, genes, and pathways involved in different cancer types.
We will further investigate the potential interaction between germline susceptibility variants and the somatic mutational landscape. Finally, both the pipeline and the results from this project will be made publicly available, helping the research community analyze and interpret the ever-growing body of large-scale cancer sequencing data and better discover and understand germline susceptibility variants.
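As a rough illustration of what joint use of tumor and matched-normal data can look like at a single nucleotide site, the sketch below labels a site as somatic, germline, or germline with LOH from allele read counts. It is a toy example under stated assumptions, not the pipeline proposed here: the names `SiteCounts` and `classify_site` and every threshold value are hypothetical choices made for illustration, and a real caller would model sequencing error, tumor purity, and copy number statistically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteCounts:
    """Reference/alternate read counts at one site (hypothetical container)."""
    normal_ref: int
    normal_alt: int
    tumor_ref: int
    tumor_alt: int


def vaf(ref: int, alt: int) -> float:
    """Variant allele fraction; 0.0 when there is no coverage."""
    depth = ref + alt
    return alt / depth if depth else 0.0


def classify_site(
    c: SiteCounts,
    min_depth: int = 20,     # assumed minimum depth in each sample
    absent: float = 0.02,    # assumed VAF at or below which the allele is treated as absent
    het_low: float = 0.30,   # assumed lower bound of a heterozygous VAF in the normal
    het_high: float = 0.70,  # assumed upper bound of a heterozygous VAF in the normal
    hom_min: float = 0.90,   # assumed VAF at or above which the normal is homozygous alternate
    loh_shift: float = 0.25, # assumed tumor-vs-normal VAF shift that suggests LOH
) -> str:
    """Label a site using illustrative fixed thresholds on tumor and normal VAFs."""
    if (c.normal_ref + c.normal_alt < min_depth
            or c.tumor_ref + c.tumor_alt < min_depth):
        return "low_coverage"

    n = vaf(c.normal_ref, c.normal_alt)
    t = vaf(c.tumor_ref, c.tumor_alt)

    if n <= absent:
        # Allele absent from the normal: tumor-only evidence suggests a somatic event.
        return "somatic" if t > absent else "reference"
    if het_low <= n <= het_high:
        # Heterozygous in the normal: a large allelic shift in the tumor suggests LOH.
        return "germline_het_with_LOH" if abs(t - n) >= loh_shift else "germline_het"
    if n >= hom_min:
        return "germline_hom"
    # Normal VAF falls between the assumed bands; leave for manual review.
    return "ambiguous"


if __name__ == "__main__":
    print(classify_site(SiteCounts(50, 0, 30, 20)))
    print(classify_site(SiteCounts(25, 25, 5, 45)))
    print(classify_site(SiteCounts(25, 25, 24, 26)))
```

The fixed thresholds keep the example short; they are the first thing one would replace with a statistical test in practice.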
The promise of personalized therapy for cancer will only be realized when each individual's germline and tumor genetic code can be read and analyzed in the clinical setting. The software tools and analysis strategies described in this proposal will enable efficient and cost-effective discovery of genetic changes relevant to cancer using publicly available high-throughput sequencing data, which will accelerate the overall understanding of genetic information and its application to human health.
Marshall, A D; Bailey, C G; Champ, K et al. (2017) CTCF genetic alterations in endometrial carcinoma are pro-tumorigenic. Oncogene 36:4100-4110
Huang, Kuan-Lin; Li, Shunqiang; Mertins, Philipp et al. (2017) Proteogenomic integration reveals therapeutic targets in breast cancer xenografts. Nat Commun 8:14864
Mashl, R Jay; Scott, Adam D; Huang, Kuan-Lin et al. (2017) GenomeVIP: a cloud platform for genomic variant discovery and interpretation. Genome Res 27:1450-1459
Manda, K R; Tripathi, P; Hsi, A C et al. (2016) NFATc1 promotes prostate tumorigenesis and overcomes PTEN loss-induced senescence. Oncogene 35:3282-92
Jones, K B; Barrott, J J; Xie, M et al. (2016) The impact of chromosomal translocation locus and fusion oncogene coding sequence in synovial sarcomagenesis. Oncogene 35:5021-32
Niu, Beifang; Scott, Adam D; Sengupta, Sohini et al. (2016) Protein-structure-guided discovery of functional mutations across 19 cancer types. Nat Genet 48:827-37
Ye, Kai; Wang, Jiayin; Jayasinghe, Reyka et al. (2016) Systematic discovery of complex insertions and deletions in human cancers. Nat Med 22:97-104
Griffith, Malachi; Griffith, Obi L; Smith, Scott M et al. (2015) Genome Modeling System: A Knowledge Management Platform for Genomics. PLoS Comput Biol 11:e1004274
Wong, Terrence N; Ramsingh, Giridharan; Young, Andrew L et al. (2015) Role of TP53 mutations in the origin and evolution of therapy-related acute myeloid leukaemia. Nature 518:552-555
Leiserson, Mark D M; Vandin, Fabio; Wu, Hsin-Ta et al. (2015) Pan-cancer network analysis identifies combinations of rare somatic mutations across pathways and protein complexes. Nat Genet 47:106-14