New methods and enhanced software for predicting functional SNPs

Sunyaev, Shamil

Abstract

The focus of genomics research is rapidly shifting from the accumulation of genetic variation data to the functional interpretation of allelic variants. Sequencing studies are becoming the standard approach in all areas of genetics, generating an unprecedented demand for computational methods to predict the functional effect of mutations. We continuously develop and maintain PolyPhen-2, a computational method for predicting the functional effect of missense mutations. PolyPhen-2 makes predictions based on comparative sequence analysis and analysis of protein structure. This method is being widely applied in diverse areas of genetics. Despite the large user base and our continuing efforts to increase prediction accuracy, there is ample room, and a great need, to improve the accuracy of the method. Our recent studies on the population genetics of deleterious alleles point to fundamental complexities in the analysis and prediction of deleterious variation. An improved understanding of these complexities, together with new types of training and validation data and algorithmic approaches, positions us to substantially improve the computational method and the software. We will also expand the utility of the method by addressing previously underserved needs. Gene discovery studies prioritize identified variants at both the gene and the variant level. The question currently addressed by PolyPhen-2 and other prediction methods is whether a given variant is likely to affect gene function. Equally important considerations are whether the gene that harbors the variant is a morbid gene and whether most missense changes in this gene or domain are likely to have a functional impact. Deep population sequencing data together with catalogs of known disease variants can be used in concert with evolutionary and structural analyses to prioritize genes. Many large-scale sequencing projects are transitioning from exomes to whole genome sequencing.
This opens the way to the analysis of non-coding variation. Non-coding variation has been shown to play a key role in the genetics of polygenic complex phenotypes. However, the importance of large-effect non-coding variants for phenotypes that segregate in a Mendelian fashion is unclear and still under debate. Through separately supported whole genome sequencing of cases of Mendelian diseases linked to known loci but lacking protein-coding variants, we will select non-coding Mendelian mutations in an unbiased fashion, analyze the potential underlying biology, and develop a computational predictor. This approach is fundamentally different from existing efforts on the analysis of non-coding variation, which predict conservation or a reduction in sequence diversity rather than directly ascertaining the pathogenic effect.
In Specific Aim 1 we will make substantial improvements in computational methods for predicting the functional effect of mutations and incorporate these improvements into the PolyPhen software.
In Specific Aim 2 we will develop gene-based scores based on population and disease genetics data and integrate them with the variant-based predictions.
In Specific Aim 3 we will extend the prediction to non-coding variation.

Public Health Relevance

The focus of genomics research is rapidly shifting from the accumulation of genetic variation data to the functional interpretation of allelic variants. We will make substantial improvements in computational methods to predict the functional effect of mutations and incorporate these improvements into the PolyPhen software. We will also develop gene-based scores and extend the prediction methods to non-coding variation.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM078598-10
Application #: 9281738
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Ravichandran, Veerasamy

Project Start: 2007-05-01
Project End: 2018-04-30
Budget Start: 2017-05-01
Budget End: 2018-04-30
Support Year: 10
Fiscal Year: 2017

Institution

Name: Brigham and Women's Hospital
DUNS #: 030811269

City: Boston
State: MA
Country: United States
Zip Code: 02115

Related projects


NIH 2017 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital
NIH 2016 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital
NIH 2015 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$365,925
NIH 2014 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$365,925
NIH 2013 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$353,118
NIH 2012 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$365,925
NIH 2010 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$334,748
NIH 2009 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$332,589
NIH 2008 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$326,359
NIH 2007 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$326,139

Publications

Cassa, Christopher A; Jordan, Daniel M; Adzhubei, Ivan et al. (2018) A literature review at genome scale: improving clinical variant assessment. Genet Med 20:936-941

Haghighi, Alireza; Krier, Joel B; Toth-Petroczy, Agnes et al. (2018) An integrated clinical program and crowdsourcing strategy for genomic sequencing and Mendelian disease gene discovery. NPJ Genom Med 3:21

Sohail, Mashaal; Vakhrusheva, Olga A; Sul, Jae Hoon et al. (2017) Negative selection in humans and fruit flies involves synergistic epistasis. Science 356:539-542

Cassa, Christopher A; Weghorn, Donate; Balick, Daniel J et al. (2017) Estimating the selective effects of heterozygous protein-truncating variants from human exome data. Nat Genet 49:806-810

Chun, Sung; Casparino, Alexandra; Patsopoulos, Nikolaos A et al. (2017) Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat Genet 49:600-605

Sul, Jae Hoon; Cade, Brian E; Cho, Michael H et al. (2016) Increasing Generality and Power of Rare-Variant Tests by Utilizing Extended Pedigrees. Am J Hum Genet 99:846-859

Savova, Virginia; Chun, Sung; Sohail, Mashaal et al. (2016) Genes with monoallelic expression contribute disproportionately to genetic diversity in humans. Nat Genet 48:231-237

Lenz, Tobias L; Spirin, Victor; Jordan, Daniel M et al. (2016) Excess of Deleterious Mutations around HLA Genes Reveals Evolutionary Cost of Balancing Selection. Mol Biol Evol 33:2555-64

Jordan, Daniel M; Frangakis, Stephan G; Golzio, Christelle et al. (2015) Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524:225-9

Balick, Daniel J; Do, Ron; Cassa, Christopher A et al. (2015) Dominance of Deleterious Alleles Controls the Response to a Population Bottleneck. PLoS Genet 11:e1005436

Showing the most recent 10 out of 31 publications
