NHGRI U24: ATLAS OF REGULATORY VARIANTS IN DISEASE (ARVID) PROJECT SUMMARY
Genome-wide association studies (GWAS) have identified thousands of single nucleotide polymorphisms (SNPs) linked to risk of developing specific non-cancerous polygenic diseases, including ischemic heart disease, chronic obstructive pulmonary disease, Alzheimer's dementia, type 2 diabetes, and ischemic stroke. These disease-linked SNPs concentrate in regulatory DNA active in cell types that may mediate disease risk by modulating genes (eGenes) whose expression levels may be important in pathogenesis. These disease-linked expression SNPs (eSNPs) commonly alter transcription factor (TF) DNA binding motifs, indicating they may affect regulatory DNA activity by changing gene regulator binding. This U24 proposal aims to generate a genomic resource, the Atlas of Regulatory Variants in Disease (ARVID), containing the following 3 broad categories of information: 1) the individual disease-linked human eSNPs with differential gene regulatory function in relevant cell types; 2) the target genes (eGenes) that these eSNPs dysregulate; and 3) the gene regulators whose DNA association such disease eSNPs alter. First, we will identify the specific functionally altered eSNPs among those linked to index SNPs identified by GWAS in the 5 widespread human diseases noted above using massively parallel reporter assays (MPRA). A resulting subset of 300 top disease risk and non-risk eSNP pairs will then be deeply characterized in isogenic cells generated by gene editing to identify directly and indirectly dysregulated target genes. This effort will produce a Genomic Compendium of a) the disease-linked eSNPs that quantitatively impact regulatory DNA function in disease-relevant cell types and of b) the eGenes for the 300 top disease eSNPs. Second, we will identify the specific gene regulators whose DNA association is altered at the 300 disease risk eSNPs above, compared to matched non-risk alleles.
To do this, we will use a live-cell proteomics approach termed DNA Protein Interaction Detection (DAPID). Quantitative mass spectrometry using isobaric tagging will be complemented by quantitative chromatin immunoprecipitation (ChIP) assays using isogenic, disease-relevant cells that differ only at the single eSNP nucleotide of interest. This effort will produce a Proteomic Atlas of differential regulator binding at 300 reference-disease eSNP pairs. This NHGRI U24 will generate a genomic resource defining the DNA variants, target genes, and gene regulators involved in inherited risk for 5 common non-cancerous polygenic human diseases.
ATLAS OF REGULATORY VARIANTS IN DISEASE (ARVID) Project Narrative
Inherited risk for common diseases arises from multiple genetic variants, most of which reside in DNA sequences that may dysregulate gene expression in a way that predisposes to illness. An atlas of functionally validated disease variants, the genes they control, and the proteins whose binding they alter will help understand, predict, and prevent common diseases.