Large-scale tumor exome sequencing projects have identified a very large number of mutations whose relevance to cancer is not yet understood. To begin to address this need, our team has produced two web applications for high-throughput computational analysis of cancer mutations: the Cancer-Related Analysis of VAriants Toolkit (CRAVAT) and the Mutation Position Imaging Toolbox (MuPIT). CRAVAT accepts millions of mutations in a single batch upload and maps them from genomic coordinates to annotated transcripts and proteins. MuPIT currently accepts batch uploads of up to 2,500 single-nucleotide variants (SNVs) and maps them from genomic coordinates onto X-ray crystal structures of proteins from the Protein Data Bank (PDB). We propose to combine and harden CRAVAT and MuPIT into a single web application, in which we will substantially improve the tools, user interface, software infrastructure, integration with external data resources and tools used by the community, and support for protected data. The scope of all tools in the web application will be broadened to handle analysis of the full range of small-scale mutation consequence types found in cancer exomes. CRAVAT analysis identifies mutations most likely to have a deleterious impact on protein function and those that are most likely to confer a selective advantage to cancer cells (drivers), using classifiers developed by our team. Classifier scores are supplemented with annotations, including population allele frequencies, previous occurrence in tumor tissue types, and gene functional categories, enabling filtering (e.g., removing polymorphisms) and prioritization. Gene-level annotation and scoring, by aggregation of classifier scores from mutations in a cohort, are also provided. MuPIT maps mutations from genomic positions onto protein structures and provides interactive viewing of mutations in the context of protein structure and in relation to a variety of annotations.
To enable prioritization of interesting mutations and genes, the application provides a preview describing each structure and all available annotations (e.g., binding sites, experimental mutagenesis results, and previously documented polymorphic and disease-associated variants). After selecting a PDB structure of interest, the user is led to an interactive visualization page, where an enhanced Jmol applet displays all SNVs mapped onto the structure. Frequently, many SNVs in the input list can be mapped onto a single structure, revealing clustering patterns around key functional sites. Since the debut of the two applications in August 2012, and based only on word of mouth, CRAVAT has been used by 129 unique users from 39 countries and has analyzed 1,136 submitted jobs totaling 27.9 million mutations. MuPIT has been used by 242 unique users from 25 countries, with 720 submitted jobs. (Source: Google Analytics.)

Public Health Relevance

The proposed work will harden and develop web applications that help the cancer genomics community interpret small-scale mutations in cancer exomes. The applications are designed to handle very large numbers of mutations and to provide analysis targeted at researchers who are not bioinformatics experts. The work will contribute to the understanding of the genetic complexity and heterogeneity of tumors and assist in the discovery of new approaches to cancer prognosis and treatment.

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Research Project--Cooperative Agreements (U01)
Project #
5U01CA180956-03
Application #
8910262
Study Section
Special Emphasis Panel (ZCA1)
Program Officer
Li, Jerry
Project Start
2013-09-17
Project End
2017-08-31
Budget Start
2015-09-01
Budget End
2017-08-31
Support Year
3
Fiscal Year
2015
Total Cost
Indirect Cost
Name
Johns Hopkins University
Department
Biostatistics & Other Math Sci
Type
Biomed Engr/Col Engr/Engr Sta
DUNS #
001910777
City
Baltimore
State
MD
Country
United States
Zip Code
21205
Masica, David L; Douville, Christopher; Tokheim, Collin et al. (2017) CRAVAT 4: Cancer-Related Analysis of Variants Toolkit. Cancer Res 77:e35-e38
Masica, David L; Karchin, Rachel (2016) Towards Increasing the Clinical Relevance of In Silico Methods to Predict Pathogenic Missense Variants. PLoS Comput Biol 12:e1004725
Tokheim, Collin J; Papadopoulos, Nickolas; Kinzler, Kenneth W et al. (2016) Evaluating the evaluation of cancer driver genes. Proc Natl Acad Sci U S A 113:14330-14335
Tokheim, Collin; Bhattacharya, Rohit; Niknafs, Noushin et al. (2016) Exome-Scale Discovery of Hotspot Mutation Regions in Human Cancer Using 3D Protein Structure. Cancer Res 76:3719-31
Douville, Christopher; Masica, David L; Stenson, Peter D et al. (2016) Assessing the Pathogenicity of Insertion and Deletion Variants with the Variant Effect Scoring Tool (VEST-Indel). Hum Mutat 37:28-35
Karchin, Rachel; Cline, Melissa S (2015) Human genetics special issue on computational molecular medicine. Hum Genet 134:455-7