Genome-wide association studies have identified common single nucleotide variants at over 160 genetic loci associated with coronary artery disease (CAD) and subclinical atherosclerosis (coronary artery calcification, carotid intima-media thickness, and carotid plaque). These discoveries have yielded important insights into the pathways that contribute to subclinical atherosclerosis and CAD, and into the genetic architecture of atherosclerosis. For example, the heritability of CAD explained by common genetic variants appears to be concentrated in regulatory regions. Nevertheless, neither the genome-wide association studies nor the exome sequencing studies performed to date have been able to examine both coding and non-coding variants across the allele frequency spectrum in relation to subclinical atherosclerosis and CAD. Furthermore, these studies have largely focused on participants of European ancestry. Approaches that identify pleiotropic loci or quantify genetic correlation among phenotypes exist but have not yet been applied to subclinical atherosclerosis and CAD. Genetic risk prediction studies based on common variants show promise with regard to improving primary prevention, but the extent to which adding low-frequency and rare variants to polygenic risk scores improves risk prediction is not known, nor have such scores been developed and tested in individuals of non-European ancestry. A wealth of whole-genome sequencing (WGS) data has been generated in populations of different ancestries by initiatives such as the National Heart, Lung, and Blood Institute (NHLBI) Trans-Omics for Precision Medicine (TOPMed) program and the National Human Genome Research Institute (NHGRI) Centers for Common Disease Genomics (CCDG) program.
To expand our knowledge of genetic factors contributing to CAD and subclinical atherosclerosis phenotypes, we propose to use WGS data from TOPMed and CCDG (up to 101,295 individuals from diverse ancestries, of whom 58% are of non-European ancestry), which provide extended genomic coverage of low-frequency and rare genetic variants as well as more complex genetic variants such as structural variants. Findings from the WGS analysis will be replicated in several large-scale data sources, including exome sequencing data and genotype data imputed using TOPMed as the reference panel. Thus, we will examine genetic variation that has so far been missed, including structural variants. We will leverage the results of these analyses to explore the genetic architecture of subclinical atherosclerosis and CAD, investigate pleiotropy and genetic correlation among subclinical atherosclerosis, CAD, and related cardiovascular traits, and assess the contribution of low-frequency and rare variants to risk prediction of CAD. Finally, we will create and test a polygenic risk score designed specifically for populations of African ancestry. This proposal brings together large-scale WGS datasets and clinical and subclinical atherosclerosis phenotypes, and exploits advances in genomic technologies and computational approaches. In doing so, we will advance the realization of precision medicine for CAD.

Public Health Relevance

Integrating whole-genome sequence information to study clinical and subclinical atherosclerosis will increase understanding of why individuals vary in susceptibility to coronary artery disease. This knowledge will open opportunities for precision medicine and improved risk prediction, and will provide directions for future research.

National Institutes of Health (NIH)
National Heart, Lung, and Blood Institute (NHLBI)
Research Project (R01)
Study Section
Cancer, Heart, and Sleep Epidemiology A Study Section (CHSA)
Program Officer
Papanicolaou, George
University of Texas Health Science Center Houston
Public Health & Prev Medicine
Schools of Public Health
United States