New methods and enhanced software for predicting functional SNPs

Sunyaev, Shamil

Abstract

Rapid advances in DNA sequencing technology have enabled massive identification and cataloging of human allelic variation in research and clinical settings. A key challenge for human genetics today is to identify, among the myriad alleles, those variants that affect molecular function and phenotypes. We previously developed computational methods for predicting the functional effect of human mutations and non-synonymous SNPs and implemented them in the software tools PolyPhen and, subsequently, PolyPhen-2. We maintain both online and standalone versions of these tools in our laboratory, and geneticists use them widely in a variety of research and clinical applications.
The explosion of large-scale population sequencing projects has greatly increased demand for prediction methods. These projects also set new requirements: the methods must improve significantly, and the software must be tailored to specific applications in new, technologically advanced human genetics. Specifically, massive exome sequencing projects that aim to identify genes harboring rare coding variants involved in human phenotypes require highly accurate, easy-to-use, and fast methods for annotating large numbers of sequence variants. At the same time, DNA sequencing is rapidly becoming a method of choice in clinical genetic diagnostics, where interpretation of novel sequence variants in human disease genes is becoming the major bottleneck in the analysis of sequencing data. Applications to clinical genetic diagnostics require a substantial increase in the accuracy of prediction methods, as well as methods that target specific protein groups and generate predictions specific to individual diagnostic tests.
The current need for interpretation of sequence variants is paralleled by an opportunity to greatly enhance computational methods and software. Genomes of multiple vertebrates provide a rich resource of information for generating predictions, and new statistical approaches are needed to employ these data optimally. The recent growth of databases of human mutations and common SNPs provides much larger training and testing datasets, and new methods should be developed to benefit fully from them.
In Specific Aim 1, we will develop a prediction method guided by the phylogenetic tree that would utilize alignments of vertebrate genomes. We will further incorporate interactions between amino acid positions in the analysis of comparative genomics data to take into account compensatory substitutions.
In Specific Aim 2, we will develop a version of the PolyPhen software for the analysis of exome or genome sequencing datasets. We will integrate functional predictions into statistical tests that detect phenotypic association of rare non-synonymous variants.
In Specific Aim 3, in close collaboration with clinical geneticists, we will test the feasibility of developing prediction methods specialized for individual diagnostic tests that would achieve clinically useful levels of specificity and sensitivity.
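To make the idea in Specific Aim 2 concrete, the following is a minimal sketch of one common way functional predictions can enter a rare-variant association test: a weighted burden statistic with a permutation p-value. It is an illustration under stated assumptions, not the method developed in this project. The function names, the toy genotypes, and the per-variant weights (standing in for a predicted probability that a variant is damaging) are all hypothetical.

```python
import random

def burden_scores(genotypes, weights):
    """Weighted count of rare alleles carried by each individual.

    genotypes: one row per individual, one column per variant (0, 1 or 2).
    weights:   hypothetical per-variant functional scores in [0, 1].
    """
    return [sum(w * g for w, g in zip(weights, row)) for row in genotypes]

def burden_statistic(scores, is_case):
    """Difference in mean burden between cases and controls."""
    cases = [s for s, c in zip(scores, is_case) if c]
    controls = [s for s, c in zip(scores, is_case) if not c]
    return sum(cases) / len(cases) - sum(controls) / len(controls)

def permutation_test(genotypes, weights, is_case, n_perm=2000, seed=0):
    """One-sided permutation p-value for an excess burden in cases."""
    rng = random.Random(seed)
    scores = burden_scores(genotypes, weights)
    observed = burden_statistic(scores, is_case)
    labels = list(is_case)
    hits = 0
    for _ in range(n_perm):
        # Shuffling labels keeps the case/control counts fixed.
        rng.shuffle(labels)
        # Small tolerance guards against floating-point ties.
        if burden_statistic(scores, labels) >= observed - 1e-12:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (n_perm + 1)

# Toy data (made up): 8 individuals, 3 rare variants in one gene.
genotypes = [
    [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1],  # cases
    [0, 0, 0], [0, 0, 1], [0, 0, 0], [0, 0, 0],  # controls
]
weights = [0.9, 0.8, 0.1]  # assumed damaging-probability scores
is_case = [True] * 4 + [False] * 4

observed, p_value = permutation_test(genotypes, weights, is_case)
print(f"burden difference = {observed:.3f}, permutation p = {p_value:.4f}")
```

Weighting by a predicted functional effect down-weights variants that are likely neutral, which is one plausible reason to combine prediction with association testing. Real analyses would also need covariates, allele-frequency filters, and far larger samples.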

Public Health Relevance

A key challenge for human genetics today is to identify, among the myriad alleles discovered by massive DNA sequencing projects, the genetic variants that affect molecular function and human disease. We previously developed widely used software for predicting the functional effect of human alleles. We plan to substantially increase the accuracy of the computational prediction method and to adapt it to the needs of large-scale sequencing projects and specific genetic diagnostic tests.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM078598-08
Application #: 8917246
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Ravichandran, Veerasamy

Project Start: 2006-04-01
Project End: 2016-04-30
Budget Start: 2015-05-01
Budget End: 2016-04-30
Support Year: 8
Fiscal Year: 2015
Total Cost: $365,925
Indirect Cost: $160,925

Institution

Name: Brigham and Women's Hospital
DUNS #: 030811269

City: Boston
State: MA
Country: United States
Zip Code: 02115

Related projects


NIH 2017 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital
NIH 2016 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital
NIH 2015 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$365,925
NIH 2014 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$365,925
NIH 2013 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$353,118
NIH 2012 R01 GM	New methods and enhanced software for predicting functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$365,925
NIH 2010 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$334,748
NIH 2009 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$332,589
NIH 2008 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$326,359
NIH 2007 R01 GM	New Methods and Enhanced Software for Predicting Functional SNPs Sunyaev, Shamil / Brigham and Women's Hospital	$326,139

Publications

Cassa, Christopher A; Jordan, Daniel M; Adzhubei, Ivan et al. (2018) A literature review at genome scale: improving clinical variant assessment. Genet Med 20:936-941

Haghighi, Alireza; Krier, Joel B; Toth-Petroczy, Agnes et al. (2018) An integrated clinical program and crowdsourcing strategy for genomic sequencing and Mendelian disease gene discovery. NPJ Genom Med 3:21

Sohail, Mashaal; Vakhrusheva, Olga A; Sul, Jae Hoon et al. (2017) Negative selection in humans and fruit flies involves synergistic epistasis. Science 356:539-542

Cassa, Christopher A; Weghorn, Donate; Balick, Daniel J et al. (2017) Estimating the selective effects of heterozygous protein-truncating variants from human exome data. Nat Genet 49:806-810

Chun, Sung; Casparino, Alexandra; Patsopoulos, Nikolaos A et al. (2017) Limited statistical evidence for shared genetic effects of eQTLs and autoimmune-disease-associated loci in three major immune-cell types. Nat Genet 49:600-605

Sul, Jae Hoon; Cade, Brian E; Cho, Michael H et al. (2016) Increasing Generality and Power of Rare-Variant Tests by Utilizing Extended Pedigrees. Am J Hum Genet 99:846-859

Savova, Virginia; Chun, Sung; Sohail, Mashaal et al. (2016) Genes with monoallelic expression contribute disproportionately to genetic diversity in humans. Nat Genet 48:231-237

Lenz, Tobias L; Spirin, Victor; Jordan, Daniel M et al. (2016) Excess of Deleterious Mutations around HLA Genes Reveals Evolutionary Cost of Balancing Selection. Mol Biol Evol 33:2555-2564

Jordan, Daniel M; Frangakis, Stephan G; Golzio, Christelle et al. (2015) Identification of cis-suppression of human disease mutations by comparative genomics. Nature 524:225-229

Balick, Daniel J; Do, Ron; Cassa, Christopher A et al. (2015) Dominance of Deleterious Alleles Controls the Response to a Population Bottleneck. PLoS Genet 11:e1005436

Showing the most recent 10 out of 31 publications
