Rapid advances in DNA sequencing technology have enabled large-scale identification and cataloging of human allelic variation in research and clinical settings. A key challenge for human genetics today is to identify, among the myriad of alleles, those variants that affect molecular function and phenotypes. We previously developed computational methods for predicting the functional effect of human mutations and non-synonymous SNPs and implemented them in the software tools PolyPhen and, subsequently, PolyPhen-2. We maintain both online and standalone versions of these tools in our laboratory, and they are widely used by geneticists in a variety of research and clinical applications.
The explosion of large-scale population sequencing projects has greatly increased demand for prediction methods. These projects also require significant improvements to the methods and software tailored to specific applications in modern, technologically advanced human genetics. Specifically, massive exome sequencing projects that aim to identify genes harboring rare coding variants involved in human phenotypes require highly accurate, easy-to-use, and fast methods for annotating large numbers of sequence variants. At the same time, DNA sequencing is rapidly becoming a method of choice in clinical genetic diagnostics, where interpretation of novel sequence variants in human disease genes is becoming the major bottleneck in the analysis of sequencing data. Applications to clinical genetic diagnostics require a substantial increase in the accuracy of prediction methods, as well as methods that target specific protein groups and generate predictions specific to individual diagnostic tests.
The current need for interpretation of sequence variants is paralleled by an opportunity to greatly enhance computational methods and software. Genomes of multiple vertebrates provide a rich resource of information for generating predictions, and new statistical approaches are needed to employ these data optimally. The recent growth of databases of human mutations and common SNPs provides much larger training and testing datasets, and new methods should be developed to benefit fully from them.
In Specific Aim 1, we will develop a prediction method that is guided by the phylogenetic tree and uses alignments of vertebrate genomes. We will further incorporate interactions between amino acid positions into the analysis of comparative genomics data to account for compensatory substitutions.
In Specific Aim 2, we will develop a version of the PolyPhen software for the analysis of exome or genome sequencing datasets. We will integrate functional predictions into statistical tests for detecting phenotypic association of rare non-synonymous variants.
In Specific Aim 3, in close collaboration with clinical geneticists, we will test the feasibility of developing prediction methods specialized for individual diagnostic tests that would achieve clinically useful levels of specificity and sensitivity.

Public Health Relevance

A key challenge for human genetics today is to identify, among the myriad of alleles discovered by massive DNA sequencing projects, the genetic variants that affect molecular function and human disease. We previously developed widely used software for predicting the functional effect of human alleles. We plan to substantially increase the accuracy of the computational prediction method and to adapt it to the needs of large-scale sequencing projects and specific genetic diagnostic tests.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Eckstrand, Irene A
Brigham and Women's Hospital
United States
Coste, Bertrand; Houge, Gunnar; Murray, Michael F et al. (2013) Gain-of-function mutations in the mechanically activated ion channel PIEZO2 cause a subtype of Distal Arthrogryposis. Proc Natl Acad Sci U S A 110:4667-72
Kiezun, Adam; Pulit, Sara L; Francioli, Laurent C et al. (2013) Deleterious alleles in the human genome are on average younger than neutral alleles of the same frequency. PLoS Genet 9:e1003301
Cassa, Christopher A; Tong, Mark Y; Jordan, Daniel M (2013) Large numbers of genetic variants considered to be pathogenic are common in asymptomatic individuals. Hum Mutat 34:1216-20
Adzhubei, Ivan; Jordan, Daniel M; Sunyaev, Shamil R (2013) Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet Chapter 7:Unit7.20
Jordan, Daniel M; Kiezun, Adam; Baxter, Samantha M et al. (2011) Development and validation of a computational method for assessment of missense variants in hypertrophic cardiomyopathy. Am J Hum Genet 88:183-92
Adzhubei, Ivan A; Schmidt, Steffen; Peshkin, Leonid et al. (2010) A method and server for predicting damaging missense mutations. Nat Methods 7:248-9
Price, Alkes L; Kryukov, Gregory V; de Bakker, Paul I W et al. (2010) Pooled association tests for rare variants in exon-resequencing studies. Am J Hum Genet 86:832-8
Jordan, Daniel M; Ramensky, Vasily E; Sunyaev, Shamil R (2010) Human allelic variation: perspective from protein function, structure, and evolution. Curr Opin Struct Biol 20:342-50
Stamatoyannopoulos, John A; Adzhubei, Ivan; Thurman, Robert E et al. (2009) Human mutation rate associated with DNA replication timing. Nat Genet 41:393-5
Kryukov, Gregory V; Shpunt, Alexander; Stamatoyannopoulos, John A et al. (2009) Power of deep, all-exon resequencing for discovery of human trait genes. Proc Natl Acad Sci U S A 106:3871-6
