Informatic profiling of clinically relevant mutation

Many approaches have been developed to predict whether a mutation associated with a disease is actually causative. In contrast to other approaches to predicting deleterious mutations, our approach, called in silico functional profiling, starts by learning residue-specific protein function and then estimates when that function is disrupted. This research will continue our efforts to characterize the underlying molecular disruption that a mutation causes and thereby improve the accuracy of these approaches. We will do this by building a database that links clinical observation with molecular phenotype and using it to develop bioinformatic models of mutation. This is particularly relevant to cancer, since many mutations in cancer are poorly understood and known only by their association with the disease.

The hypothesis of this proposal is that computational methods that predict specific residue function from protein sequence and structure can classify known disease-associated mutations by function better than existing computational methods, and less expensively than experimental assays. In short, we will describe each phenotypically annotated mutation as possibly affecting catalysis, protein interactions, posttranslational modification, and stability of the protein. The structural environments around disease-associated mutations can be characterized using a combination of computational biochemical methods, based on first principles of biomolecular structure and function, and statistical informatics methods.

We will continue this research by implementing the following steps. First, we will build a database of how often mutations in cancer, pharmacogenetics, Mendelian disease, and complex disease disrupt phosphorylation, stability, catalysis, protein interaction, and other posttranslational modifications. Second, we will build a bioinformatic model of disruption using machine learning methods trained with these and other commonly used features. Finally, we will link these to clinical observation by annotating disease-causing mutations with an ontology of diseases and integrating these predictions into databases of mutation. Thus, we will link clinical observation with molecular phenotypes by building a useful database and new models of how mutations cause disease.
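As a rough illustration of the first step, the tally of how often each molecular function is disrupted within a disease class could be sketched as below. This is a minimal sketch under stated assumptions, not the project's actual schema: the record type `AnnotatedMutation`, the category names in `FUNCTIONS`, the helper `disruption_frequencies`, and the example records are all hypothetical and invented for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass

# Hypothetical category names, modeled on the functions named in the abstract.
FUNCTIONS = (
    "catalysis",
    "protein_interaction",
    "posttranslational_modification",
    "stability",
)


@dataclass(frozen=True)
class AnnotatedMutation:
    """One phenotypically annotated mutation (hypothetical record layout)."""
    protein: str          # protein identifier
    change: str           # amino acid substitution, e.g. "R100W"
    disease_class: str    # e.g. "cancer", "mendelian"
    disrupted: frozenset  # subset of FUNCTIONS predicted to be disrupted


def disruption_frequencies(mutations):
    """Return, per disease class, the fraction of mutations disrupting each function."""
    counts = defaultdict(Counter)
    totals = Counter()
    for m in mutations:
        totals[m.disease_class] += 1
        for function in m.disrupted:
            counts[m.disease_class][function] += 1
    # A Counter returns 0 for functions never seen in a class.
    return {
        cls: {f: counts[cls][f] / totals[cls] for f in FUNCTIONS}
        for cls in totals
    }


# Made-up records, purely for illustration; not real annotations.
EXAMPLE = [
    AnnotatedMutation("PROT_A", "R100W", "cancer", frozenset({"stability"})),
    AnnotatedMutation(
        "PROT_A", "S200F", "cancer",
        frozenset({"posttranslational_modification", "protein_interaction"}),
    ),
    AnnotatedMutation("PROT_B", "G50D", "mendelian", frozenset({"catalysis"})),
]

if __name__ == "__main__":
    for cls, freqs in disruption_frequencies(EXAMPLE).items():
        print(cls, freqs)
```

Per-class frequencies of this kind could also serve as inputs to the machine learning model described in the second step, alongside the other commonly used features.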

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Ye, Jane
Buck Institute for Age Research
United States
Cirincione, Ann G; Clark, Kaylyn L; Kann, Maricel G (2018) Pathway networks generated from human disease phenome. BMC Med Genomics 11:75
Peterson, Thomas A; Gauran, Iris Ivy M; Park, Junyong et al. (2017) Oncodomains: A protein domain-centric framework for analyzing rare variants in tumor samples. PLoS Comput Biol 13:e1005428
Cai, Binghuang; Li, Biao; Kiga, Nikki et al. (2017) Matching phenotypes to whole genomes: Lessons learned from four iterations of the personal genome project community challenges. Hum Mutat 38:1266-1276
Pejaver, Vikas; Mooney, Sean D; Radivojac, Predrag (2017) Missense variant pathogenicity predictors generalize well across a range of function-specific prediction challenges. Hum Mutat 38:1092-1108
Lugo-Martinez, Jose; Pejaver, Vikas; Pagel, Kymberleigh A et al. (2016) The Loss and Gain of Functional Amino Acid Residues Is a Common Mechanism Causing Human Inherited Disease. PLoS Comput Biol 12:e1005091
Ioannidis, Nilah M; Rothstein, Joseph H; Pejaver, Vikas et al. (2016) REVEL: An Ensemble Method for Predicting the Pathogenicity of Rare Missense Variants. Am J Hum Genet 99:877-885
Jiang, Yuxiang; Oron, Tal Ronnen; Clark, Wyatt T et al. (2016) An expanded evaluation of protein function prediction methods shows an improvement in accuracy. Genome Biol 17:184
Peterson, Thomas A; Mort, Matthew; Cooper, David N et al. (2016) Regulatory Single-Nucleotide Variant Predictor Increases Predictive Performance of Functional Regulatory Variants. Hum Mutat 37:1137-1143
Katzman, Shana M; Strotmeyer, Elsa S; Nalls, Michael A et al. (2015) Mitochondrial DNA Sequence Variation Associated With Peripheral Nerve Function in the Elderly. J Gerontol A Biol Sci Med Sci 70:1400-8
Friedberg, Iddo; Wass, Mark N; Mooney, Sean D et al. (2015) Ten simple rules for a community computational challenge. PLoS Comput Biol 11:e1004150

Showing the most recent 10 out of 59 publications