Cardiovascular diseases (CVD) are the leading cause of morbidity and mortality in the United States. Atherosclerotic cardiovascular disease (ASCVD) is the primary mechanism for the development of CVD and is considered largely preventable by the Centers for Disease Control and Prevention. Lipid-lowering therapy is the current mainstay of preventive treatment for ASCVD, and guidelines for pharmacotherapy rely on the 2013 Pooled Cohort Equations (PCE) for estimating 10-year risk. While these equations have been validated at the population level, they have significant shortcomings that limit real-world, patient-level effectiveness. These include implementation burden (i.e., the time and effort required for clinicians to enter patient data into a phone- or web-based calculator), therapy-changing sensitivity to highly variable inputs (e.g., single-time-point blood pressure), paradoxical risk estimates for some patient subgroups that are an artifact of linear modeling (e.g., women who smoke), blunt treatment of race (i.e., separately derived equations for Black patients), and poor calibration for modern cohorts (i.e., overestimation of risk). This project will attempt to address these shortcomings. First, portable tools for analyzing electronic health records (EHR) found within the Rhode Island Health Information Exchange (HIE) will be developed to extract PCE risk factors and enable automated calculation of ASCVD risk. PCE risk factor extraction permutations (e.g., last vs. mean blood pressure) will be optimized, and the equations will be calibrated for the population. Next, EHR-system-agnostic tools will be developed for extracting additional risk factors available within the medical record, including symptom development, social determinants of health, and family history. PCE and non-PCE risk factors will be used for artificial neural network and dynamic Bayesian network modeling of ASCVD risk phenotype clusters to augment PCE risk prediction. Finally, an ASCVD genetic risk score derived from single nucleotide polymorphism (SNP) genotype data will be integrated with the HIE-derived risk factors to demonstrate the potential clinical implications of implementing an omics-integrated learning healthcare system. The project will serve as foundational training for the principal investigator toward a career as a physician-scientist in the field of biomedical informatics.
Hypothesis: Atherosclerotic cardiovascular disease risk estimation is central to current lipid-lowering therapy guidelines. This project will test the hypothesis that population-level, data-driven methods will improve the accuracy of risk calculators.
Aim 1: Determine the Predictive Performance of PCE Risk Factors Derived from Longitudinal HIE Data
Aim 2: Define Population-Based ASCVD Risk Phenotype Clusters
Aim 3: Demonstrate HIE-Omics-Integrated Learning Healthcare System with Direct-to-Consumer Sequencing
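
As a rough illustration of what an "extraction permutation" for a PCE risk factor might look like, the sketch below contrasts taking the last recorded systolic blood pressure with taking the mean of all recorded values. The record type `BPReading`, the helper names, and the readings themselves are hypothetical and are not drawn from the project or from any HIE schema; this is a minimal sketch under those assumptions, not the project's method.

```python
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass(frozen=True)
class BPReading:
    """Hypothetical longitudinal record: one systolic reading per encounter."""
    taken_on: date
    systolic_mmhg: float


def last_systolic(readings: list[BPReading]) -> float:
    """'Last' strategy: use the most recent reading by date."""
    return max(readings, key=lambda r: r.taken_on).systolic_mmhg


def mean_systolic(readings: list[BPReading]) -> float:
    """'Mean' strategy: average every available reading."""
    return mean(r.systolic_mmhg for r in readings)


# Illustrative values only; not real patient data.
readings = [
    BPReading(date(2020, 1, 10), 128.0),
    BPReading(date(2020, 6, 2), 136.0),
    BPReading(date(2021, 3, 15), 150.0),
]

print("last:", last_systolic(readings))
print("mean:", mean_systolic(readings))
```

For the same hypothetical patient, the two strategies supply different systolic values to a risk calculator, which is the kind of input sensitivity that comparing extraction permutations is meant to expose.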

Public Health Relevance

Cardiovascular diseases (CVD) are the leading cause of death and the most significant contributor to disability-adjusted life years in the United States. Atherosclerotic cardiovascular disease (ASCVD) is the primary mechanism for the development of CVD, and 10-year risk estimation is the foundation of current lipid-lowering therapy guidelines. If successful, this project will improve the prediction of ASCVD and serve as a model for health information exchange-derived risk estimation and population-guided preventive therapy recommendations.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Individual Predoctoral NRSA for M.D./Ph.D. Fellowships (ADAMHA) (F30)
Study Section
Special Emphasis Panel (ZLM1)
Program Officer
Sim, Hua-Chuan
Brown University
Internal Medicine/Medicine
Schools of Medicine
United States