Genomic medicine offers hope for improved diagnostic methods and for more effective, patient-specific therapies. Genome-wide association studies (GWAS) identify genetic markers that improve understanding of the risks and causes of many diseases, and may guide diagnosis and therapy on a patient-specific basis. This project will take another approach to identifying gene-disease associations: it will perform a "reverse GWAS," or phenome-wide association study (PheWAS), to determine which phenotypes are associated with a given genotype. The project is enabled by a large DNA biobank coupled to a de-identified copy of the electronic medical record (EMR).
This project has four specific aims. First, the project will develop and validate a standardized approach to extracting disease phenotypes from EMRs, integrating national standard terminologies of clinical disorders with descriptors relating to the treatment and diagnosis of each disease to create a sharable knowledge base. The project will use natural language processing, structured data queries, and heuristic and machine learning methods to accurately identify patients with each disease and corresponding controls.
The second aim is to perform PheWAS analyses using existing genotype data. To validate the method, the project will use PheWAS to "rediscover" SNPs with known disease associations. The project will also investigate statistical methods for large-scale multiple hypothesis testing to discover novel phenotype associations.
The third aim is to apply the PheWAS algorithms at four other sites with EMR-linked DNA biobanks and compare the results. In the fourth aim, the project will validate novel phenotype-genotype associations discovered through PheWAS with new genotyping in a previously untested population. The tools generated by this project will not only make PheWAS possible, but will also broadly enable clinical research and subsequent genetic studies.
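The PheWAS scan described in the second aim can be illustrated with a minimal sketch. It is not the project's method: the abstract does not name a test statistic or a correction procedure, so Fisher's exact test and a Bonferroni threshold are used here only as simple stand-ins. The function names (`fisher_exact_two_sided`, `phewas`), the phenotype labels, and the toy patient sets are all hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are cases/controls, columns are carriers/non-carriers.
    """
    row1, row2, col1 = a + b, c + d, a + c
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalized hypergeometric weight of every table with the same margins.
    weights = {x: comb(row1, x) * comb(row2, col1 - x) for x in range(lo, hi + 1)}
    observed = weights[a]
    # Sum the tables that are no more likely than the observed one.
    extreme = sum(w for w in weights.values() if w <= observed)
    return extreme / comb(row1 + row2, col1)


def phewas(carriers, phenotypes, alpha=0.05):
    """Test one genotype against many phenotypes (illustrative only).

    carriers:   set of patient ids carrying the variant of interest
    phenotypes: dict mapping phenotype name -> (case ids, control ids)
    Returns (name, p_value, significant) tuples sorted by p-value, where
    "significant" uses a Bonferroni threshold of alpha / number of tests.
    """
    if not phenotypes:
        return []
    threshold = alpha / len(phenotypes)
    results = []
    for name, (cases, controls) in phenotypes.items():
        a = len(cases & carriers)      # cases carrying the variant
        b = len(cases - carriers)      # cases without the variant
        c = len(controls & carriers)   # controls carrying the variant
        d = len(controls - carriers)   # controls without the variant
        p = fisher_exact_two_sided(a, b, c, d)
        results.append((name, p, p < threshold))
    return sorted(results, key=lambda r: r[1])


# Hypothetical toy cohort: 40 patients, the first 20 carry the variant.
carriers = set(range(20))
phenotypes = {
    # Every case is a carrier: strong association.
    "phenotype_A": (set(range(10)), set(range(10, 40))),
    # Cases split evenly between carriers and non-carriers: no association.
    "phenotype_B": (
        {0, 1, 2, 3, 4, 20, 21, 22, 23, 24},
        set(range(5, 20)) | set(range(25, 40)),
    ),
}

for name, p, significant in phewas(carriers, phenotypes):
    print(f"{name}: p={p:.3g} significant={significant}")
```

Bonferroni correction is deliberately conservative. With thousands of correlated phenotypes, a real analysis would likely weigh alternatives such as false discovery rate control, which is the kind of question the second aim proposes to investigate.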

Public Health Relevance

The promise of genomic medicine is to predict individuals' disease risk and treatment given their genetic information. This project will develop methods to identify diseases from electronic medical records and then find novel genetic associations from existing genomic data.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Vanderbilt University Medical Center
Internal Medicine/Medicine
Schools of Medicine
United States
Rhoades, Seth D; Bastarache, Lisa; Denny, Joshua C et al. (2018) Pulling the covers in electronic health records for an association study with self-reported sleep behaviors. Chronobiol Int 35:1702-1712
Barnado, April; Carroll, Robert J; Casey, Carolyn et al. (2018) Phenome-wide association study identifies marked increased in burden of comorbidities in African Americans with systemic lupus erythematosus. Arthritis Res Ther 20:69
Bastarache, Lisa; Hughey, Jacob J; Hebbring, Scott et al. (2018) Phenotype risk scores identify patients with unrecognized Mendelian disease patterns. Science 359:1233-1239
Robinson, Jamie R; Denny, Joshua C; Roden, Dan M et al. (2018) Genome-wide and Phenome-wide Approaches to Understand Variable Drug Actions in Electronic Health Records. Clin Transl Sci 11:112-122
Zhao, Junfei; Cheng, Feixiong; Jia, Peilin et al. (2018) An integrative functional genomics framework for effective identification of novel regulatory variants in genome-phenome studies. Genome Med 10:7
Mosley, Jonathan D; Feng, QiPing; Wells, Quinn S et al. (2018) A study paradigm integrating prospective epidemiologic cohorts and electronic health records to identify disease biomarkers. Nat Commun 9:3522
Bloodworth, Melissa H; Rusznak, Mark; Bastarache, Lisa et al. (2018) Association of ST2 polymorphisms with atopy, asthma, and leukemia. J Allergy Clin Immunol 142:991-993.e3
Denny, Joshua C; Van Driest, Sara L; Wei, Wei-Qi et al. (2018) The Influence of Big (Clinical) Data and Genomics on Precision Medicine and Drug Development. Clin Pharmacol Ther 103:409-418
Dahir, Kathryn M; Tilden, Daniel R; Warner, Jeremy L et al. (2018) Rare Variants in the Gene ALPL That Cause Hypophosphatasia Are Strongly Associated With Ovarian and Uterine Disorders. J Clin Endocrinol Metab 103:2234-2243
Barnado, A; Carroll, R J; Casey, C et al. (2018) Phenome-wide association study identifies dsDNA as a driver of major organ involvement in systemic lupus erythematosus. Lupus :961203318815577

Showing the most recent 10 out of 76 publications