Regulatory sequences determine the level, location and timing of gene expression. These sequences are important in nearly all biological processes and many disease conditions. In many cases, the onset of cancer is likely related to changes in these regulatory sequences. This might involve single-nucleotide changes that destroy or create motifs for transcription factor binding. In other cases, structural variants might move a gene from one location to another, placing it under the control of the wrong regulatory region entirely. In still other cases, integration of viral regulatory sequences into the promoter region of genes might drive expression of oncogenes. The proposed project will develop new tools to identify genes that have undergone a change in their regulatory sequences leading to cancer. Specifically, we will develop new software for the identification and prioritization of non-coding mutations from whole-genome sequence data. We will also develop a hybridization-based targeted-capture reagent to allow sequencing of prioritized regulatory regions when whole-genome sequencing is either too expensive or lacks coverage of the regions of interest. Genes found to have recurrently mutated regulatory regions could make suitable targets for therapeutic intervention and could also have prognostic and diagnostic value. In the long term, a better understanding of regulatory elements and gene expression patterns could help in the development of gene-based therapies that reduce the undesired side effects of conventional cancer therapies.
Recent advances in high-throughput sequencing have allowed the comprehensive identification of DNA mutations in human cancer. Initial efforts at interpretation have focused almost entirely on protein-coding regions, while the non-coding regulatory sequences that control the timing and location of expression of these proteins have been largely overlooked. The research proposed here will address this knowledge gap by developing new technologies, knowledge bases and software to identify regulatory mutations driving progression of breast cancer and other solid tumors.