Mapping of cis-regulatory elements by ENCODE and Roadmap Epigenomics has led to increased recognition of the importance of non-coding regulatory regions as the next frontier for discovering important somatic variants in childhood cancers. Recent discoveries of somatic non-coding mutations that cause oncogenic activation of TERT in melanoma and TAL1 in pediatric T-ALL support this idea and have inspired genome-wide investigations of non-coding somatic mutations in cancer. However, the non-coding genome is vast and largely uncharted, and it has proven difficult to distinguish the "oncogenic drivers" from the "passengers" among the many non-coding sequence variants. By combining our computational expertise in genomic analysis with the experimental expertise of our co-investigators Drs. Thomas Look and Suzanne Baker in molecular oncogenesis and Dr. Chunliang Li in novel assay development, we aim to discover new somatic non-coding variants that serve as drivers of pediatric cancer. This effort capitalizes on the strength of our work to date on the pediatric cancer genome landscape, including non-coding regions, and on the richness of our unique resource of "omics" results from existing whole genome sequencing (WGS) and RNA-seq of >2,000 paired tumor/normal childhood cancer samples. Through a pilot study of 33 T-lineage acute lymphoblastic leukemias, we demonstrate that an integrative approach can successfully distinguish "driver" mutations from "passengers" and discover novel variants in the non-coding genomes of childhood cancers.
In Aim 1, we will discover somatic alterations in non-coding regions that are associated with aberrant, allele-specific expression by analyzing WGS and RNA-seq data from 2,000 patient samples and from established cancer cell lines. We will focus on sequence alterations that form locus-specific transcription factor binding sites and will employ a massively parallel reporter assay to measure the enhancer activity of the candidate non-coding mutations.
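As an illustration only, allele-specific expression at a heterozygous site can be screened by asking whether RNA-seq reads favor one allele more than a balanced 50/50 expectation would allow. The sketch below uses an exact two-sided binomial test. The function names, the minimum depth of 10 reads, and the 0.05 threshold are assumptions chosen for the example and do not describe the project's actual pipeline, which would also need to account for mapping bias, copy number, and multiple testing.

```python
from math import exp, lgamma, log


def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value.

    Sums the probabilities of all outcomes that are no more likely
    than the observed count k (computed in log space so that deep
    RNA-seq coverage does not overflow).
    """
    if n == 0:
        return 1.0
    log_p, log_q = log(p), log(1.0 - p)
    pmf = [
        exp(lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
            + i * log_p + (n - i) * log_q)
        for i in range(n + 1)
    ]
    observed = pmf[k]
    # Small relative tolerance guards against floating-point ties.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-7))
    return min(1.0, total)


def allelic_imbalance(alt_reads, ref_reads, min_depth=10, alpha=0.05):
    """Flag a heterozygous site whose RNA-seq reads depart from 50/50.

    min_depth and alpha are illustrative defaults (assumptions).
    Returns None when coverage is too low to test.
    """
    depth = alt_reads + ref_reads
    if depth < min_depth:
        return None
    p_value = binomial_two_sided_p(alt_reads, depth)
    return {
        "alt_fraction": alt_reads / depth,
        "p_value": p_value,
        "imbalanced": p_value < alpha,
    }


if __name__ == "__main__":
    # Hypothetical read counts at one heterozygous site.
    print(allelic_imbalance(alt_reads=45, ref_reads=5))
```

A site flagged this way is only a candidate: linking the imbalance to a nearby somatic non-coding variant still requires phasing the variant with the over-expressed allele.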
In Aim 2, we will develop a computational framework for predicting non-coding variant pathogenicity based on statistical analysis of patient data and on mechanistic insights into regulatory non-coding variants revealed by laboratory investigation. We will discover abnormal enhancer-promoter interactions in pediatric cancer patient-derived xenograft (PDX) models or cell lines using 3-D chromatin assays such as Capture-C. We will use ChIP-seq and RNA-seq to analyze the functional consequences of non-coding variants in PDXs or cancer cell lines. In parallel, we will use CRISPR-Cas9 genome editing tools to modify the mutant allele and further explore the regulatory mechanisms.
In Aim 3, we will develop a user-friendly, web-based visualization tool to accelerate discovery of non-coding driver mutations. The tool will make the non-coding variants discovered in our study publicly accessible to the research community, together with an integrated genome-wide view of "omics" datasets. The discovery of non-coding somatic "driver" mutations in childhood cancers through this collaborative effort will lead to major advances in our understanding of the molecular pathogenesis of childhood cancers, and our computational predictive model will provide new insight into the implementation of individualized "precision medicine".
The goal of our project is to discover somatic non-coding driver mutations in pediatric cancer by integrating innovative bioinformatics analysis of genomic sequencing data generated from 2,000 pediatric cancer patients with creative experimental approaches to document functional significance. Novel mechanisms and actionable oncogenes discovered by this approach will have broad implications for molecular pathogenesis and clinical management in pediatric cancers.