Non-coding single nucleotide polymorphisms (SNPs) account for over 85% of the genotype-phenotype associations identified in genome-wide association studies (GWAS), yet we understand almost nothing about their functional mechanisms. Numerous lines of evidence demonstrate that regulatory SNPs play causal roles in many complex human phenotypes. GWAS associations are enriched for variants associated with gene expression levels (eQTLs) and for variants within cis-regulatory elements (CREs). Because eQTLs and CREs are often functional in only a subset of cell types, and because a particular cell type is often of interest for a disease, it is critical that analyses of GWAS-eQTL overlap consider cell specificity. Our long-term research objective is to determine, for every non-coding SNP, whether it is functional in a particular cell type and, if so, the specific mechanism by which it functions. To reach this goal, we need a large set of cell-specific, causal functional SNPs from which we can begin to generalize; the results from current eQTL studies are typically insufficient because they are not always relevant for a cell type of interest, they identify tag SNPs instead of the causal SNP, and they do not integrate CREs. Our objectives in this proposal are to develop statistical models to identify, quantify, and functionally interpret cell-specific eQTLs in cis and trans, and to experimentally validate causal variant predictions using novel massively parallel CRE reporter assays.
In Aim 1, we will develop multivariate Bayesian regression models that will improve power for eQTL detection, improve the interpretability of eQTL cell specificity, and identify the CREs through which each SNP functions.
In Aim 2, we will develop structured sparse latent factor models to identify cell-specific gene coexpression modules that will be used to identify trans-eQTLs while simultaneously controlling for hidden confounding variables.
In Aim 3, we will develop and apply massively parallel CRE reporter assays to validate thousands of predicted causal variants that underlie eQTL associations. With such a large collection of cell-specific causal eQTNs and CREs in hand, we hope to mechanistically interpret GWAS associations, identify cancer-causing somatic mutations, and specify novel drug targets for human disease.

Public Health Relevance

Genome-wide association studies have identified thousands of genetic variants that contribute to the risk of developing disease. However, we have not often translated these results into a mechanistic understanding of how diseases manifest, which is important for risk screening and drug design. To make this translation possible, we propose to identify causal variants that affect disease risk by regulating gene expression, the tissues in which they are active, and the genomic regulatory mechanisms that are disrupted by each variant.

Agency: National Institutes of Health (NIH)
Institute: National Institute of Mental Health (NIMH)
Type: Research Project (R01)
Project #: 3R01MH101822-03S1
Application #: 9266550
Study Section:
Program Officer: Addington, Anjene M
Project Start: 2016-07-01
Project End: 2017-06-30
Budget Start: 2016-07-01
Budget End: 2017-06-30
Support Year: 3
Fiscal Year: 2016
Total Cost: $305,036
Indirect Cost: $57,950

Name: University of Pennsylvania
Department: Genetics
Type: Schools of Medicine
DUNS #: 042250712
City: Philadelphia
State: PA
Country: United States
Zip Code: 19104

Zhang, Mingfeng; Lykke-Andersen, Soren; Zhu, Bin et al. (2018) Characterising cis-regulatory variation in the transcriptome of histologically normal and tumour-derived pancreatic tissues. Gut 67:521-533
McDowell, Ian C; Barrera, Alejandro; D'Ippolito, Anthony M et al. (2018) Glucocorticoid receptor recruits to enhancers and drives activation by motif-directed binding. Genome Res 28:1272-1284
Agrawal, A; Chou, Y-L; Carey, C E et al. (2018) Genome-wide association study identifies a novel locus for cannabis dependence. Mol Psychiatry 23:1293-1302
Dolan, M Eileen; El Charif, Omar; Wheeler, Heather E et al. (2017) Clinical and Genome-Wide Analysis of Cisplatin-Induced Peripheral Neuropathy in Survivors of Adult-Onset Cancer. Clin Cancer Res 23:5757-5768
Varma, V R; Varma, S; An, Y et al. (2017) Alpha-2 macroglobulin in Alzheimer's disease: a marker of neuronal injury through the RCAN1 pathway. Mol Psychiatry 22:13-23
Mohammadi, Pejman; Castel, Stephane E; Brown, Andrew A et al. (2017) Quantifying the regulatory effect size of cis-acting genetic variation using allelic fold change. Genome Res 27:1872-1884
GTEx Consortium; Laboratory, Data Analysis & Coordinating Center (LDACC)—Analysis Working Group; Statistical Methods groups—Analysis Working Group et al. (2017) Genetic effects on gene expression across human tissues. Nature 550:204-213
Peckham-Gregory, Erin C; Chakraborty, Rikhia; Scheurer, Michael E et al. (2017) A genome-wide association study of LCH identifies a variant in SMAD6 associated with susceptibility. Blood 130:2229-2232
Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H et al. (2017) Co-expression networks reveal the tissue-specific regulation of transcription and splicing. Genome Res 27:1843-1858
Mercader, Josep M; Liao, Rachel G; Bell, Avery D et al. (2017) A Loss-of-Function Splice Acceptor Variant in IGF2 Is Protective for Type 2 Diabetes. Diabetes 66:2903-2914

Showing the most recent 10 out of 47 publications