Therapeutic risk factor modification has provided a significant decrease in coronary artery disease (CAD) in Western populations. However, significant risk is due to common inherited genetic variation that affects disease pathways in the vessel wall; this risk remains poorly understood and lacks specific therapies. To further our long-term goal of characterizing the molecular basis for this genetic risk, we have participated in genome-wide association studies (GWAS) identifying allelic variation linked to CAD risk, and these efforts have yielded hundreds of associated loci. However, the majority of identified causal variation resides outside of protein-coding exons, in regulatory regions of the genome that are poorly understood, and further efforts are required to understand the mechanisms of association and thus disease risk. Our central hypothesis is that an important subset of disease allelic variation primarily regulates long non-coding RNA (lncRNA) expression, and that this effect modulates causal protein-coding gene (pcGene) expression through functional genomic interactions such as chromosomal looping. Our objective here is to investigate the role these lncRNAs play in mediating expression of CAD causal pcGenes, and the mechanism by which they accomplish this function. Our rationale is that lncRNAs serve as a critical intermediary between genetic and epigenetic signaling, and that elucidating their mechanism of function is a key aspect of understanding CAD risk. To gain fundamental information regarding the mode of action of these molecules in the context of CAD, we propose to study human coronary artery smooth muscle cell (HCASMC) lncRNAs.
In Aim 1, we will identify lncRNAs that are regulated in these cells by disease-related stimuli and that map to CAD GWAS loci. Co-expression network analyses will connect these lncRNAs to pcGenes and provide a starting point for network and pathway analyses that begin to establish their biological functional associations.
In Aim 2, we will map expression quantitative trait locus (eQTL) variants for each of the lncRNAs, using a high-throughput allele-specific expression method that provides quantification of low-abundance RNAs. Discovered lncRNA eQTLs will be investigated to determine whether they colocalize with CAD GWAS causal variation, as well as with genomic molecular trait QTLs. CRISPR genome editing will be employed to validate the eQTLs and confirm pcGene identity.
In Aim 3, we will employ CRISPR inhibition and single-cell RNA sequencing (Perturb-seq) to map the transcriptional networks regulated by the disease-related lncRNAs, and also investigate their in vitro cellular effects on HCASMC. These studies will be aided by our extensive work with primary cultured HCASMC characterizing epigenome modification, chromatin accessibility, and looping, and by our efforts to map CAD GWAS causal variants and genes that mediate risk in this cell type. This work is highly innovative in that it combines unique genomic datasets developed in a highly disease-relevant cell type, and significant in that it will integrate lncRNAs, their regulatory variation, and molecular mechanisms into the etiology of CAD risk.
Significant expense and effort by groups of scientists around the world have led to the identification of regions of the human genome that are associated with the genetic risk for various forms of cardiovascular disease, including coronary artery disease. Additional research is required to understand the specific genes involved and how they work together to contribute to disease risk. Such information will allow the development of better risk assessment and therapeutics for vascular diseases such as coronary artery disease.