eQTL Mega-analysis for Functional Assessment of Multi-enhancer Gene Regulation

This proposal is in response to RFA HG-13-013, Interpreting Variation in Human Non-Coding Genomic Regions Using Computational Approaches and Experimental Assessment (R01). It uses statistical modeling to identify multiple regulatory variants per transcript genome-wide, validates their function by genome engineering, and establishes their relevance in the context of inflammation. We propose to combine two parallel approaches to the identification of regulatory polymorphisms in a unique resource of 10,000 peripheral blood transcriptome profiles linked to whole genome genotypes. Multivariate regression will then be used to fine map the highest probability common variants, focusing on those that play a critical role in transcriptional regulation specifically in the context of inflammatory autoimmune diseases. CRISPR/Cas9-mediated site-specific genome engineering will be used to experimentally confirm the predictions on a moderate-throughput basis for autoimmune loci in a lymphoid cell line. The computational approach will apply hierarchical sparse learning (structured SL) models, informed by empirical measures of linkage disequilibrium and incorporating evolutionary probabilities and ENCODE functional annotations, to predict which variants are most likely to influence transcript abundance. Extensive simulations will be used to define the parameters that influence the sensitivity and specificity of multivariate regulatory polymorphism detection, while also reducing the regulatory target for each transcript to just a dozen variants. Since a major objective of the RFA is not just to prioritize regulatory variants, but also to establish their influence on organismal phenotypes, we will profile their association with transcript abundance in T-lymphocytes isolated from peripheral blood samples exposed for 24 hours to lipopolysaccharide (LPS) or the inflammatory cytokine TNF-α.
Peripheral blood contains most of the relevant immune cell types, and our expectation is that genetic effects are modified in disease by the inflammatory agents, with some variants losing their effect and novel variants arising. Furthermore, direct demonstration of regulatory function will be obtained by genome engineering for a set of up to 150 inflammatory autoimmune disease genes already identified by GWAS. Non-homologous end joining will be used to disrupt each candidate site in a screening step, with droplet digital PCR measuring the impact of mutations on gene expression. Homology-directed repair will then be used for allele-specific replacement, in a handful of cases generating all possible haplotypes to experimentally confirm the predicted joint effects in a common genetic background. The computational and experimental approaches are expected to be extensible to many common diseases, and all code will be made publicly available in conjunction with the MEGA suite of software for evolutionary genome analysis.
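
The idea of detecting several regulatory variants per transcript by sparse multivariate regression on simulated data can be illustrated with a minimal sketch. It is not the proposal's structured SL model: a plain lasso fitted by coordinate descent stands in for it, and linkage disequilibrium appears only in the simulated genotypes, with no evolutionary or ENCODE priors. All function names, the causal variant indices, effect sizes, sample size, and penalty are hypothetical values chosen for illustration.

```python
import random

# Hypothetical causal variants: column index -> effect on expression
CAUSAL = {2: 1.0, 7: -0.8}


def simulate_genotypes(n, p, rng, maf=0.3, ld=0.6):
    """Return p columns of 0/1/2 genotypes; neighboring variants are in LD."""
    def haplotype():
        alleles = [1 if rng.random() < maf else 0]
        for _ in range(p - 1):
            if rng.random() < ld:          # copy the previous allele (creates LD)
                alleles.append(alleles[-1])
            else:                          # draw an independent allele
                alleles.append(1 if rng.random() < maf else 0)
        return alleles

    rows = [[a + b for a, b in zip(haplotype(), haplotype())] for _ in range(n)]
    return [list(col) for col in zip(*rows)]


def standardize(cols):
    """Scale each column to mean 0 and variance 1."""
    out = []
    for col in cols:
        mean = sum(col) / len(col)
        sd = (sum((v - mean) ** 2 for v in col) / len(col)) ** 0.5
        out.append([(v - mean) / sd for v in col])
    return out


def soft_threshold(z, lam):
    """Shrink z toward zero by lam, returning exactly 0 inside [-lam, lam]."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso(cols, y, lam, n_iter=200):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1, standardized X."""
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    for _ in range(n_iter):
        for j, x in enumerate(cols):
            # Covariance of variant j with the residual that excludes its own fit
            rho = sum(xi * (ri + xi * beta[j]) for xi, ri in zip(x, resid)) / n
            new = soft_threshold(rho, lam)
            delta = new - beta[j]
            if delta != 0.0:
                resid = [ri - xi * delta for xi, ri in zip(x, resid)]
                beta[j] = new
    return beta


def run_demo(seed=1, n=400, p=12, lam=0.1):
    """Simulate one transcript with two causal variants and fit the lasso."""
    rng = random.Random(seed)
    cols = standardize(simulate_genotypes(n, p, rng))
    y = [sum(b * cols[j][i] for j, b in CAUSAL.items()) + rng.gauss(0, 0.5)
         for i in range(n)]
    mean_y = sum(y) / n
    return lasso(cols, [v - mean_y for v in y], lam)


if __name__ == "__main__":
    for j, b in enumerate(run_demo()):
        print(j, round(b, 3))
```

Repeating such a run across different LD levels, effect sizes, and sample sizes is one simple way to explore how sensitivity and specificity depend on those parameters, though the simulations described in the abstract may be set up quite differently.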

Public Health Relevance

Complex disease arises in large part as a result of aberrant gene expression. This study uses advanced statistical methodologies to narrow the candidates to a small number of the most likely variants that regulate expression of genes in the immune system, and to confirm their role in inflammatory autoimmune disease. Genome engineering technologies will then be used to experimentally confirm their function.

National Institutes of Health (NIH)
National Human Genome Research Institute (NHGRI)
Research Project (R01)
Study Section: Special Emphasis Panel (ZHG1-HGR-M (J1))
Program Officer: Brooks, Lisa
Georgia Institute of Technology
Schools of Arts and Sciences
United States
Patel, Ravi; Scheinfeldt, Laura B; Sanderford, Maxwell D et al. (2018) Adaptive Landscape of Protein Variation in Human Exomes. Mol Biol Evol 35:2015-2025
Zeng, Biao; Lloyd-Jones, Luke R; Holloway, Alexander et al. (2017) Constraints on eQTL Fine Mapping in the Presence of Multisite Local Regulation of Gene Expression. G3 (Bethesda) 7:2533-2544