Regulatory sequences determine the level, location and timing of gene expression. These sequences are important in nearly all biological processes and many disease conditions. In many cases, the onset of cancer is likely related to changes in these regulatory sequences. This might involve single nucleotide changes that destroy or create motifs for transcription factor binding. In other cases, structural variants might translocate a gene from one location to another, placing it under the control of the wrong regulatory region entirely. In still other cases, integration of viral regulatory sequences into the promoter regions of genes might drive expression of oncogenes. The proposed project will develop new tools to identify genes whose regulatory sequences have undergone changes that lead to cancer. Specifically, we will develop new software for the identification and prioritization of non-coding mutations from whole genome sequence data. We will also develop a hybridization-based targeted-capture reagent to allow sequencing of prioritized regulatory regions when whole genome sequencing is either too expensive or lacks coverage of the regions of interest. Genes found to have recurrently mutated regulatory regions could be suitable targets for therapeutic intervention and could have prognostic and diagnostic value. In the long term, a better understanding of regulatory elements and gene expression patterns could help in the development of gene-based therapies that reduce the undesired side effects of conventional cancer therapies.

Public Health Relevance

Recent advances in high-throughput sequencing have allowed the comprehensive identification of DNA mutations in human cancer. Initial efforts at interpretation have focused almost entirely on protein-coding regions, while the non-coding regulatory sequences that control the timing and location of expression of these proteins have been largely overlooked. The research proposed here will address this knowledge gap by developing new technologies, knowledge bases and software to identify regulatory mutations driving progression of breast cancer and other solid tumors.

National Institutes of Health (NIH)
National Cancer Institute (NCI)
Career Transition Award (K22)
Study Section
Special Emphasis Panel (ZCA1)
Program Officer
Jakowlew, Sonia B
Washington University
Schools of Medicine
Saint Louis
United States
Wagner, Alex H; Coffman, Adam C; Ainscough, Benjamin J et al. (2016) DGIdb 2.0: mining clinically relevant drug-gene interactions. Nucleic Acids Res 44:D1036-44
Hundal, Jasreet; Carreno, Beatriz M; Petti, Allegra A et al. (2016) pVAC-Seq: A genome-guided in silico approach to identifying tumor neoantigens. Genome Med 8:11
Xin, Jiwen; Mark, Adam; Afrasiabi, Cyrus et al. (2016) High-performance web services for querying gene and variant annotation. Genome Biol 17:91
Griffith, Malachi; Griffith, Obi L; Krysiak, Kilannin et al. (2016) Comprehensive genomic analysis reveals FLT3 activation and a therapeutic strategy for a patient with relapsed adult B-lymphoblastic leukemia. Exp Hematol 44:603-13
Lesurf, Robert; Cotto, Kelsy C; Wang, Grace et al. (2016) ORegAnno 3.0: a community-driven resource for curated regulatory annotation. Nucleic Acids Res 44:D126-32
Griffith, Malachi; Walker, Jason R; Spies, Nicholas C et al. (2015) Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud. PLoS Comput Biol 11:e1004393
Kumar, Runjun D; Searleman, Adam C; Swamidass, S Joshua et al. (2015) Statistically identifying tumor suppressors and oncogenes from pan-cancer genome-sequencing data. Bioinformatics 31:3561-8
Griffith, Malachi; Miller, Christopher A; Griffith, Obi L et al. (2015) Optimizing cancer genome sequencing and analysis. Cell Syst 1:210-223