Genomic medicine offers hope for improved diagnostic methods and for more effective, patient-specific therapies. Genome-wide association studies (GWAS) identify genetic markers that improve clinical understanding of the risks and mechanisms of many diseases and conditions, and that may ultimately guide diagnosis and therapy on a patient-specific basis. This project will expand on existing work to identify gene-phenotype associations across the genome and phenome, deploying new phenome-wide association study (PheWAS) methods to investigate electronic medical record (EMR)-derived phenotypes in depth against both common and rare variants across the genome. The project is enabled by large DNA biobanks coupled to de-identified copies of EMRs. This project has three specific aims. The first aim is to expand the PheWAS phenotype library to include both binary traits and continuous variables, incorporating about 7,000 phenotypes derived from natural language processing, laboratory data, and report data.
The second aim is to perform a PheWAS for common and rare variants using extant genome-wide and exome variant data and the broader set of phenotypes derived in Aim 1. We will analyze associations using single-variant and multi-variant aggregation methods, and we will validate the Aim 2 methods by comparing their results against known associations.
The third aim is to develop a standards-based infrastructure for sharing PheWAS results and to develop tools that enable others to perform PheWAS. The tools generated by this project will not only expand the capabilities of the current PheWAS methodology but will also broadly enable clinical research and subsequent genetic studies.

Public Health Relevance

Genomic medicine offers hope for improved diagnosis and for more effective, patient-specific therapies. This PheWAS proposal will develop new methods to identify detailed phenotypes and diseases from electronic medical records and then find novel genetic associations in existing genomic data.

National Institutes of Health (NIH)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Vanderbilt University Medical Center
Internal Medicine/Medicine
Schools of Medicine
United States
Rasmussen, Luke V; Thompson, Will K; Pacheco, Jennifer A et al. (2014) Design patterns for the development of electronic health record-driven phenotype extraction algorithms. J Biomed Inform 51:280-6
Shameer, Khader; Denny, Joshua C; Ding, Keyue et al. (2014) A genome- and phenome-wide association study to identify genetic variants influencing platelet count and volume and their pleiotropic effects. Hum Genet 133:95-109
Heatherly, Raymond; Denny, Joshua C; Haines, Jonathan L et al. (2014) Size matters: how population size influences genotype-phenotype association studies in anonymized data. J Biomed Inform 52:243-50
Carroll, Robert J; Bastarache, Lisa; Denny, Joshua C (2014) R PheWAS: data analysis and plotting tools for phenome-wide association studies in the R environment. Bioinformatics 30:2375-6
Ritchie, Marylyn D; Denny, Joshua C; Zuvich, Rebecca L et al. (2013) Genome- and phenome-wide analyses of cardiac conduction identifies markers of arrhythmia risk. Circulation 127:1377-85
Lasko, Thomas A; Denny, Joshua C; Levy, Mia A (2013) Computational phenotype discovery using unsupervised feature learning over noisy, sparse, and irregular clinical data. PLoS One 8:e66341
Heatherly, Raymond D; Loukides, Grigorios; Denny, Joshua C et al. (2013) Enabling genomic-phenomic association discovery without sacrificing anonymity. PLoS One 8:e53875
Wei, Wei-Qi; Cronin, Robert M; Xu, Hua et al. (2013) Development and evaluation of an ensemble resource linking medications to their indications. J Am Med Inform Assoc 20:954-61