Non-coding single nucleotide polymorphisms (SNPs) account for over 85% of the genotype-phenotype associations identified in genome-wide association studies (GWAS), yet we understand almost nothing about their functional mechanisms. Numerous lines of evidence demonstrate that regulatory SNPs play causal roles in many complex human phenotypes. GWAS associations are enriched for variants associated with gene expression levels (eQTLs) and for variants within cis-regulatory elements (CREs). Because eQTLs and CREs are often functional in only a subset of cell types, and because a particular cell type is often of interest for a disease, it is critical that analyses of GWAS-eQTL overlap consider cell specificity. Our long-term research objective is to determine, for every non-coding SNP, whether it is functional in a particular cell type and, if so, the specific mechanism by which it functions. To reach this goal, we need a large set of cell-specific, causal functional SNPs from which we can begin to generalize; the results from current eQTL studies are typically insufficient because they are not always relevant to a cell type of interest, they identify tag SNPs instead of the causal SNP, and they do not integrate CREs. Our objectives in this proposal are to develop statistical models to identify, quantify, and functionally interpret cell-specific eQTLs in cis and trans, and to experimentally validate causal variant predictions using novel massively parallel CRE reporter assays.
In Aim 1, we will develop multivariate Bayesian regression models that will improve power for eQTL detection, improve the interpretability of eQTL cell specificity, and identify the CREs through which each SNP functions.
In Aim 2, we will develop structured sparse latent factor models to identify cell-specific gene coexpression modules that will be used to identify trans-eQTLs while simultaneously controlling for hidden confounding variables.
In Aim 3, we will develop and apply massively parallel CRE reporter assays to validate thousands of predicted causal variants that underlie eQTL associations. With such a large collection of cell-specific causal eQTNs and CREs in hand, we hope to mechanistically interpret GWAS associations, identify cancer-causing somatic mutations, and specify novel drug targets for human disease.

Public Health Relevance

Genome-wide association studies have identified thousands of genetic variants that contribute to the risk of developing disease. However, these results have rarely been translated into a mechanistic understanding of how diseases manifest, and that understanding is important for risk screening and drug design. To make this translation possible, we propose to identify causal variants that affect disease risk by regulating gene expression, the tissues in which they are active, and the genomic regulatory mechanisms that are disrupted by each variant.

National Institutes of Health (NIH)
National Institute of Mental Health (NIMH)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1-GGG-H (50))
Program Officer
Addington, Anjene M
University of Pennsylvania
Schools of Medicine
United States
Castel, Stephane E; Mohammadi, Pejman; Chung, Wendy K et al. (2016) Rare variant phasing and haplotypic expression from RNA sequencing with phASER. Nat Commun 7:12817
Long, Quan; Argmann, Carmen; Houten, Sander M et al. (2016) Inter-tissue coexpression network analysis reveals DPP4 as an important gene in heart to blood communication. Genome Med 8:15
Benitez-Buelga, Carlos; Vaclová, Tereza; Ferreira, Sofia et al. (2016) Molecular insights into the OGG1 gene, a cancer risk modifier in BRCA1 and BRCA2 mutations carriers. Oncotarget 7:25815-25
Hartmann, Katherine; Seweryn, Michał; Handleman, Samuel K et al. (2016) Non-linear interactions between candidate genes of myocardial infarction revealed in mRNA expression profiles. BMC Genomics 17:738
Hou, Liping; Bergen, Sarah E; Akula, Nirmala et al. (2016) Genome-wide association study of 40,000 individuals identifies two novel loci associated with bipolar disorder. Hum Mol Genet 25:3383-3394
Stacey, Simon N; Kehr, Birte; Gudmundsson, Julius et al. (2016) Insertion of an SVA-E retrotransposon into the CASP8 gene is associated with protection against prostate cancer. Hum Mol Genet 25:1008-18
Brinkmeyer-Langford, Candice L; Guan, Jinting; Ji, Guoli et al. (2016) Aging Shapes the Population-Mean and -Dispersion of Gene Expression in Human Brains. Front Aging Neurosci 8:183
Gordon, Erin D; Palandra, Joe; Wesolowska-Andersen, Agata et al. (2016) IL1RL1 asthma risk variants regulate airway type 2 inflammation. JCI Insight 1:e87871
Yang, Jialiang; Huang, Tao; Petralia, Francesca et al. (2015) Synchronized age-related gene expression changes across multiple tissues in human and the link to complex diseases. Sci Rep 5:15145
Guo, Cong; Ludvik, Anton E; Arlotto, Michelle E et al. (2015) Coordinated regulatory variation associated with gestational hyperglycaemia regulates expression of the novel hexokinase HKDC1. Nat Commun 6:6069
