Statistical Models to Investigate Long-Distance QTL Transcription Regulation

Engelhardt, Barbara

Abstract

Thousands of genome-wide association studies link specific diseases or complex phenotypes to single mutations in the human genome. But translating these results to medical treatments requires a precise understanding of how that mutation contributes to the mechanism of disease. Currently, the regulatory role of single nucleotide polymorphisms (SNPs) is, for the most part, confined to local, or cis-, expression quantitative trait loci (eQTLs) in a small number of human tissues. But not all diseases or complex phenotypes are mediated by cis-eQTLs. Very few long-distance, or trans-, eQTLs have been identified and validated in human tissues, although trans-eQTLs play an important role in some complex phenotypes. Alternative splicing has also been shown to modulate certain phenotypes; however, little is known about SNPs that regulate alternative splicing. The proposed K99/R00 research seeks to design statistical methods that build gene and transcript networks to identify SNPs that regulate gene and mRNA isoform transcription, both locally and over long distances, and to validate those findings, for the purpose of providing insight into mechanisms for complex phenotypes and disease.

We propose to leverage cis-eQTLs and gene expression data in humans identified in our current work to build precise, directed gene networks on a genome scale. We will build these networks using Bayesian statistical models to compute the probability of a particular network with respect to each gene in the network jointly, with associated eQTLs providing information about whether regulated genes are upstream or downstream of other network genes. We will use Markov chain Monte Carlo and linear programming relaxation methods that have been shown to find near-optimal solutions to this type of problem. We will use these networks to identify trans-eQTLs, and quantify the effect of each trans-eQTL in a particular process using Bayesian statistical tests developed in our lab.
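The idea that an eQTL can indicate whether a gene sits upstream or downstream of another can be illustrated with a small simulation. This is only a sketch under stated assumptions, not the Bayesian network model the proposal describes: the data are simulated, the chain SNP -> gene A -> gene B, the effect sizes, and the names `simulate` and `partial_corr` are all hypothetical, and a simple partial correlation stands in for the proposal's statistical tests.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after regressing z out of both."""
    def residual(v, z):
        design = np.column_stack([np.ones_like(z), z])
        coef, *_ = np.linalg.lstsq(design, v, rcond=None)
        return v - design @ coef
    return np.corrcoef(residual(x, z), residual(y, z))[0, 1]

def simulate(n=2000, seed=0):
    """Simulated chain (an assumption for illustration): SNP -> gene A -> gene B."""
    rng = np.random.default_rng(seed)
    snp = rng.binomial(2, 0.3, size=n).astype(float)  # genotype coded 0/1/2
    gene_a = 0.8 * snp + rng.normal(size=n)           # SNP is a cis-eQTL for A
    gene_b = 0.7 * gene_a + rng.normal(size=n)        # B is regulated by A only
    return snp, gene_a, gene_b

snp, gene_a, gene_b = simulate()

# The SNP is associated with B (a trans effect) only through A.
marginal = np.corrcoef(snp, gene_b)[0, 1]
# Conditioning on the upstream gene A removes the SNP-B association ...
given_a = partial_corr(snp, gene_b, gene_a)
# ... whereas conditioning on the downstream gene B leaves the SNP-A association.
given_b = partial_corr(snp, gene_a, gene_b)

print(f"corr(SNP, B)     = {marginal:.3f}")
print(f"corr(SNP, B | A) = {given_a:.3f}")
print(f"corr(SNP, A | B) = {given_b:.3f}")
```

In this simulated setting the asymmetry between the two conditional associations is what orients the edge from A to B; real expression data would additionally need to account for confounders and noise.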
Subsequently, we propose to exploit the opportunities of novel RNA sequencing techniques and nonparametric statistical models to identify transcript isoforms for each transcribed gene and, simultaneously, individual-specific transcript levels by extending sparse factor analysis models. This will enable us to identify QTLs that regulate the transcription of specific transcript isoforms (tQTLs) via alternative splicing events by extending the methods we have for eQTL identification. We will use the methodology we developed for eQTLs to build networks for transcript isoforms (transcript networks). Finally, we will use transcript networks to identify and quantify tQTLs that regulate individual-specific levels of transcript isoforms both locally and over long genetic distances, as with eQTLs. We will make all of our methods and results publicly available.
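To make the distinction between an eQTL and a tQTL concrete, the following toy example shows a variant that shifts isoform usage without changing total gene expression. It is a minimal sketch, assuming isoform-level counts are already available; the counts and genotypes are invented for illustration, the helper names are hypothetical, and a least-squares slope stands in for the sparse factor analysis and Bayesian tests named in the abstract.

```python
import numpy as np

# Hypothetical data: six individuals, genotype coded 0/1/2, and read counts
# for two isoforms of one gene (columns: isoform 1, isoform 2).
genotype = np.array([0, 0, 1, 1, 2, 2], dtype=float)
isoform_counts = np.array([
    [90, 10],
    [80, 20],
    [60, 40],
    [50, 50],
    [30, 70],
    [20, 80],
], dtype=float)

def isoform_fraction(counts):
    """Fraction of each individual's reads assigned to each isoform."""
    return counts / counts.sum(axis=1, keepdims=True)

def slope(x, y):
    """Least-squares slope of y on x."""
    fitted_slope, _intercept = np.polyfit(x, y, deg=1)
    return fitted_slope

fractions = isoform_fraction(isoform_counts)

# Change in isoform 2 usage per copy of the alternate allele.
usage_slope = slope(genotype, fractions[:, 1])
# Change in total gene-level counts per copy of the alternate allele.
total_slope = slope(genotype, isoform_counts.sum(axis=1))

print(f"isoform 2 usage slope: {usage_slope:.3f}")
print(f"total expression slope: {total_slope:.3f}")
```

Because the total count is the same for every individual in this toy data, a gene-level eQTL test would see no signal, while the isoform-level fraction changes with genotype.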

Public Health Relevance

Thousands of genome-wide association studies link specific diseases or complex traits to single mutations in the human genome, but these results cannot yet be translated to medical treatments because knowing that a mutation is associated with a disease does not, in fact, give us insight into how that mutation contributes to the mechanism of disease. Our proposed research will design and validate statistical methods that provide a comprehensive road map to understanding the biological role of the mutations that are identified in these association studies. With the role of thousands of possibly disease-related mutations in hand, researchers can begin to piece together the mechanism of a disease and translate their findings into treatments for the disease much more quickly.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Transition Award (R00)
Project #: 7R00HG006265-05
Application #: 9064281
Study Section: Special Emphasis Panel (NSS)
Program Officer: Volpi, Simona

Project Start: 2011-09-06
Project End: 2016-06-30
Budget Start: 2014-09-01
Budget End: 2016-06-30
Support Year: 5
Fiscal Year: 2014
Total Cost: $247,199
Indirect Cost: $16,124

Institution

Name: Princeton University
Department: Biostatistics & Other Math Sci
Type: Schools of Engineering
DUNS #: 002484665

City: Princeton
State: NJ
Country: United States
Zip Code: 08543

Related projects


NIH 2014 R00 HG	Statistical models to investigate long-distance QTL transcription regulation Engelhardt, Barbara Elizabeth / Duke University
NIH 2014 R00 HG	Statistical Models to Investigate Long-Distance QTL Transcription Regulation Engelhardt, Barbara Elizabeth / Princeton University	$247,199
NIH 2013 R00 HG	Statistical models to investigate long-distance QTL transcription regulation Engelhardt, Barbara Elizabeth / Duke University	$249,000
NIH 2012 R00 HG	Statistical models to investigate long-distance QTL transcription regulation Engelhardt, Barbara Elizabeth / Duke University	$249,000

Publications

McDowell, Ian C; Barrera, Alejandro; D'Ippolito, Anthony M et al. (2018) Glucocorticoid receptor recruits to enhancers and drives activation by motif-directed binding. Genome Res 28:1272-1284

Saha, Ashis; Kim, Yungil; Gewirtz, Ariel D H et al. (2017) Co-expression networks reveal the tissue-specific regulation of transcription and splicing. Genome Res 27:1843-1858

Tonner, Peter D; Darnell, Cynthia L; Engelhardt, Barbara E et al. (2017) Detecting differential growth of microbial populations with Gaussian process regression. Genome Res 27:320-333

Gao, Chuan; McDowell, Ian C; Zhao, Shiwen et al. (2016) Context Specific and Differential Gene Co-expression Networks via Bayesian Biclustering. PLoS Comput Biol 12:e1004791

van den Berg, Stéphanie M; de Moor, Marleen H M; Verweij, Karin J H et al. (2016) Meta-analysis of Genome-Wide Association Studies for Extraversion: Findings from the Genetics of Personality Consortium. Behav Genet 46:170-82

Zhang, Weiwei; Spector, Tim D; Deloukas, Panos et al. (2015) Predicting genome-wide DNA methylation using methylation marks, genomic position, and DNA regulatory elements. Genome Biol 16:14

Genetics of Personality Consortium; de Moor, Marleen H M; van den Berg, Stéphanie M et al. (2015) Meta-analysis of Genome-wide Association Studies for Neuroticism, and the Polygenic Association With Major Depressive Disorder. JAMA Psychiatry 72:642-50

Mimno, David; Blei, David M; Engelhardt, Barbara E (2015) Posterior predictive checks to quantify lack-of-fit in admixture models of latent population structure. Proc Natl Acad Sci U S A 112:E3441-50

Hart, Amy B; Gamazon, Eric R; Engelhardt, Barbara E et al. (2014) Genetic variation associated with euphorigenic effects of d-amphetamine is associated with diminished risk for schizophrenia and attention deficit hyperactivity disorder. Proc Natl Acad Sci U S A 111:5968-73

Muratore, Kathryn E; Engelhardt, Barbara E; Srouji, John R et al. (2013) Molecular function prediction for a family exhibiting evolutionary tendencies toward substrate specificity swapping: recurrence of tyrosine aminotransferase activity in the Iα subfamily. Proteins 81:1593-609

Showing the most recent 10 out of 14 publications
