Genetic association studies have identified more than 1,000 genetic loci associated with complex disease traits in human populations. However, a central challenge remains: interpreting the vast amounts of data generated by genome-wide association studies (GWAS) to improve our understanding of disease markers and, in turn, disease mechanisms, which is critical for translating GWAS findings into genomic medicine applications that improve diagnostics, therapies, and outcomes. Recent efforts to incorporate prior biological information into GWAS analysis have greatly enhanced the interpretation of GWAS findings by providing biological frameworks for prioritizing associations and for interpreting multiple associated loci within the context of biological networks and pathways. We recently demonstrated that position-specific evolutionary priors can be incorporated into the analysis of GWAS results to prioritize variants that are more reproducible across studies. We propose to develop, investigate, and apply evolutionarily informed integrative methods that embrace and leverage the genetic complexity of common disease. We hypothesize that position-specific evolutionary features can be incorporated into multiscale biological pathway and network analysis, and that evolutionarily informed pathway and network analysis can be applied to existing GWAS and clinical data sets to identify mechanisms giving rise to complex disease phenotypes in populations and individuals. We propose to test these hypotheses through the following specific aims: (1) Develop novel evolutionarily informed pathway and network analysis methods for interpreting GWAS findings. (2) Apply the novel methods to established GWAS and clinical data for type 2 diabetes (T2D) to elucidate disease mechanisms underlying the genetic architecture across populations. (3) Develop a public database and software tool to enable evolutionarily informed network analysis of GWAS findings for the broader research community.

Public Health Relevance

Type 2 diabetes and other common diseases have complex genetic architectures involving up to hundreds or thousands of genetic factors. Genetic association studies are being performed to uncover these factors, but it remains challenging to use the results of these studies to learn more about the genetic basis of these diseases. We propose to develop and apply advanced evolutionary and integrative genomic methods to explore the existing genetic association data for type 2 diabetes and further elucidate the underlying genetic causes of the disease.

National Institutes of Health (NIH)
National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)
Research Project (R01)
Study Section
Genomics, Computational Biology and Technology Study Section (GCAT)
Program Officer
Mckeon, Catherine T
Icahn School of Medicine at Mount Sinai
Schools of Medicine
New York
United States
Patel, Ravi; Scheinfeldt, Laura B; Sanderford, Maxwell D et al. (2018) Adaptive Landscape of Protein Variation in Human Exomes. Mol Biol Evol 35:2015-2025
Hodos, Rachel; Zhang, Ping; Lee, Hao-Chih et al. (2018) Cell-specific prediction and application of drug-induced gene expression profiles. Pac Symp Biocomput 23:32-43
Smith, Milo R; Glicksberg, Benjamin S; Li, Li et al. (2018) Loss-of-function of neuroplasticity-related genes confers risk for human neurodevelopmental disorders. Pac Symp Biocomput 23:68-79
Shameer, Khader; Glicksberg, Benjamin S; Hodos, Rachel et al. (2018) Systematic analyses of drugs and disease indications in RepurposeDB reveal pharmacological, biological and epidemiological factors influencing drug repositioning. Brief Bioinform 19:656-678
Johnson, Kipp W; Glicksberg, Benjamin S; Hodos, Rachel A et al. (2018) Causal inference on electronic health records to assess blood pressure treatment targets: an application of the parametric g formula. Pac Symp Biocomput 23:180-191
Miotto, Riccardo; Wang, Fei; Wang, Shuang et al. (2018) Deep learning for healthcare: review, opportunities and challenges. Brief Bioinform 19:1236-1246
Smith, Milo R; Yevoo, Priscilla; Sadahiro, Masato et al. (2018) Integrative bioinformatics identifies postnatal lead (Pb) exposure disrupts developmental cortical plasticity. Sci Rep 8:16388
Glicksberg, Benjamin S; Miotto, Riccardo; Johnson, Kipp W et al. (2018) Automated disease cohort selection using word embeddings from Electronic Health Records. Pac Symp Biocomput 23:145-156
Lee, Hao-Chih; Kosoy, Roman; Becker, Christine E et al. (2017) Automated cell type discovery and classification through knowledge transfer. Bioinformatics 33:1689-1695
Wei, Chengguo; Li, Li; Menon, Madhav C et al. (2017) Genomic Analysis of Kidney Allograft Injury Identifies Hematopoietic Cell Kinase as a Key Driver of Renal Fibrosis. J Am Soc Nephrol 28:1385-1393

Showing the most recent 10 out of 40 publications