Single nucleotide polymorphisms (SNPs) comprise the majority of the genetic differences between human individuals. Non-synonymous coding SNPs (nsSNPs), which result in amino acid replacements in protein sequences, together with cis-regulatory SNPs affecting transcription and splicing, are collectively thought to account for much of the genetic component of individual variation in susceptibility to complex diseases, response to pharmaceuticals, and other phenotypes. Identification of functional nsSNPs can be facilitated by computational predictions based on the analysis of protein multiple sequence alignments, 3D structures, and sequence annotations. This analysis was earlier automated in the computer program PolyPhen, an online tool maintained in our laboratory. Numerous researchers in diverse fields currently use PolyPhen to predict the effect of nsSNPs on protein structure and function. However, there is an increasing need for more accurate computational approaches to improve such predictions and to expand the applicability of PolyPhen to all classes of polymorphisms. This proposal focuses on improving the methods incorporated in PolyPhen for predicting the functional effect of SNPs in the human genome, and on transforming PolyPhen into scalable, user-friendly, cross-platform software. The proposal targets three Specific Aims. First, we propose to improve the accuracy of PolyPhen by introducing new computational strategies for predicting the effect of nsSNPs on protein structure and function (Specific Aim 1). Methodological innovations will include the development of a multiple sequence alignment pipeline that suppresses false predictions arising from misalignments. A new method will eliminate false-negative predictions resulting from compensatory substitutions in homologous sequences. We will use a structurally optimized Bayesian classifier to predict the functional effect of nsSNPs based on multiple features derived from protein sequence and structure.
Next, we propose to extend the prediction method to non-coding SNPs (Specific Aim 2). We plan to take advantage of the extensive comparative genomic data that have been and continue to be generated. We will introduce a computational approach to predict functional SNPs in non-coding regions on the basis of probabilistic evolutionary models. Finally, we plan to incorporate these developments into a new version of the PolyPhen software system, which will address significant demand for a robust, cross-platform tool that diverse investigators can easily apply to the functional analysis of human SNPs (Specific Aim 3). This new version of PolyPhen will be incorporated into the Clinical Research Chart developed by the i2b2 National Center for Biomedical Computing and integrated with VISTA visualization tools.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM078598-04
Application #: 7825415
Study Section: Special Emphasis Panel (ZRG1-BST-D (51))
Program Officer: Remington, Karin A
Project Start: 2007-05-01
Project End: 2012-04-30
Budget Start: 2010-05-01
Budget End: 2012-04-30
Support Year: 4
Fiscal Year: 2010
Total Cost: $334,748
Indirect Cost:

Name: Brigham and Women's Hospital
Department:
Type:
DUNS #: 030811269
City: Boston
State: MA
Country: United States
Zip Code: 02115

Cassa, Christopher A; Jordan, Daniel M; Adzhubei, Ivan et al. (2018) A literature review at genome scale: improving clinical variant assessment. Genet Med 20:936-941
Haghighi, Alireza; Krier, Joel B; Toth-Petroczy, Agnes et al. (2018) An integrated clinical program and crowdsourcing strategy for genomic sequencing and Mendelian disease gene discovery. NPJ Genom Med 3:21
Sohail, Mashaal; Vakhrusheva, Olga A; Sul, Jae Hoon et al. (2017) Negative selection in humans and fruit flies involves synergistic epistasis. Science 356:539-542
Cassa, Christopher A; Weghorn, Donate; Balick, Daniel J et al. (2017) Estimating the selective effects of heterozygous protein-truncating variants from human exome data. Nat Genet 49:806-810
Chun, Sung; Casparino, Alexandra; Patsopoulos, Nikolaos A et al. (2017) Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat Genet 49:600-605
Sul, Jae Hoon; Cade, Brian E; Cho, Michael H et al. (2016) Increasing Generality and Power of Rare-Variant Tests by Utilizing Extended Pedigrees. Am J Hum Genet 99:846-859
Savova, Virginia; Chun, Sung; Sohail, Mashaal et al. (2016) Genes with monoallelic expression contribute disproportionately to genetic diversity in humans. Nat Genet 48:231-237
Lenz, Tobias L; Spirin, Victor; Jordan, Daniel M et al. (2016) Excess of Deleterious Mutations around HLA Genes Reveals Evolutionary Cost of Balancing Selection. Mol Biol Evol 33:2555-2564
Jordan, Daniel M; Frangakis, Stephan G; Golzio, Christelle et al. (2015) Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524:225-229
Balick, Daniel J; Do, Ron; Cassa, Christopher A et al. (2015) Dominance of Deleterious Alleles Controls the Response to a Population Bottleneck. PLoS Genet 11:e1005436

Showing the most recent 10 out of 31 publications