Integrating epigenomics with DNA breathing dynamics for human non-coding disease variants

Alexandrov, Boian; Duan, Jubao; He, Xin

Abstract

Genome-wide association studies (GWAS) and whole-genome sequencing of complex diseases have revealed a plethora of disease risk variants, most of which lie in noncoding regions of DNA without easily interpretable function. A main functional mechanism of noncoding variants is to alter chromatin accessibility to transcription factors (TFs), thereby influencing gene expression. Predicting the effects of noncoding variants on TF binding and gene expression on a large scale is thus important but remains challenging. Available computational tools for predicting regulatory variants largely rely on TF-binding motif models and/or local chromatin modification features. Here, we aim to develop a novel computational framework to address two major limitations of these methods. First, given that known disease-causal noncoding variants often reside outside of TF binding motifs, how can we improve the prediction of TF binding variants outside of motifs? For this, we plan to integrate TF ChIP-seq data with features that are important for TF binding but have not been considered in previous methods, in particular DNA breathing dynamics (AIM1). DNA breathing reflects local transient opening of the DNA double helix due to thermal fluctuations. We have shown that genetic variants can alter nearby (up to a few hundred base pairs away) DNA breathing dynamics that affect TF binding. Using TF ChIP-seq data, we will train models that predict specific TF binding variants in or outside TF motifs, incorporating DNA breathing dynamics with other features such as DNA shapes and cooperative TF binding.

Second, given that chromatin features show only modest (<2-fold) enrichment of genetic variants associated with complex diseases or traits, how can we improve the prediction of regulatory variants? For this, we will build a computational model that treats allele-specific chromatin accessibility (ASCA; i.e., the two alleles of a heterozygous individual show read imbalance in chromatin accessibility assays) as a functional readout of a regulatory variant (AIM2). We have shown that neuronal ASCA SNPs are highly enriched for those implicated by schizophrenia (SZ) GWAS. Using neuronal ASCA data, we will train models that predict variants with regulatory effects, taking advantage of our TF-specific classifiers (from AIM1). As a proof of concept, the models will be applied to a large SZ GWAS dataset to predict putative causal regulatory variants. We will validate the effects of the predicted top-ranking regulatory SZ variants on gene expression in a well-powered hiPSC sample by combining multiplex CRISPR-based SNP editing and single-cell RNA-seq analysis (AIM3). For SNPs showing the strongest regulatory effects, we will further use CRISPR editing to verify the SNP effect on gene expression and disease-relevant neuronal phenotypes. Accurately predicting noncoding variants that affect TF binding will enable better understanding of the large number of noncoding variants implicated in complex disorders and help formulate testable biological hypotheses, ultimately facilitating the development of targeted therapeutics.
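
As an illustration of the ASCA readout described above (read imbalance between the two alleles at a heterozygous site), the following is a minimal sketch of one common way to test for such imbalance: an exact two-sided binomial test against a 50:50 null. This is an assumption for illustration only, not the project's stated method; the function name `binomial_two_sided_p` and the `p_null` parameter are hypothetical.

```python
from math import comb


def binomial_two_sided_p(ref_reads: int, alt_reads: int, p_null: float = 0.5) -> float:
    """Exact two-sided binomial p-value for allelic read imbalance.

    Hypothetical helper: tests whether the reference-allele read count
    at a heterozygous site deviates from the proportion expected under
    the null (p_null, assumed 0.5 here).
    """
    n = ref_reads + alt_reads
    if n == 0:
        return 1.0  # no reads, so no evidence of imbalance
    # Probability of every possible reference-read count under the null.
    pmf = [comb(n, k) * p_null**k * (1 - p_null) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_reads]
    # Sum the outcomes that are no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    p_value = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Example: 30 reference reads vs 10 alternate reads at one heterozygous SNP.
p = binomial_two_sided_p(30, 10)
print(f"two-sided p-value: {p:.4g}")
```

In practice, a test like this would be run per heterozygous SNP and followed by multiple-testing correction; reference-mapping bias may also shift the null proportion away from 0.5, which is why `p_null` is left as a parameter in this sketch.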

Public Health Relevance

We will develop novel computational methods and a cost-effective functional validation approach to systematically infer the effect of disease-associated noncoding variants on transcription factor binding and gene expression. Identifying the functional noncoding variants that are associated with disease risk will help illuminate causal molecular mechanisms, facilitating the clinical translation of genetic findings into disease risk prediction and treatment.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of Mental Health (NIMH)
Type: Research Project (R01)
Project #: 5R01MH116281-03
Application #: 10115126
Study Section: Molecular Neurogenetics Study Section (MNG)
Program Officer: Beer, Rebecca Lynn

Project Start: 2019-04-08
Project End: 2024-01-31
Budget Start: 2021-02-01
Budget End: 2022-01-31
Support Year: 3
Fiscal Year: 2021

Institution

Name: Northshore University Healthsystem
DUNS #: 069490621

City: Evanston
State: IL
Country: United States
Zip Code: 60201

Related projects


NIH 2021 R01 MH	Integrating epigenomics with DNA breathing dynamics for human non-coding disease variants Alexandrov, Boian Stoianov; Duan, Jubao; He, Xin / Northshore University Healthsystem
NIH 2020 R01 MH	Integrating epigenomics with DNA breathing dynamics for human non-coding disease variants Alexandrov, Boian Stoianov; Duan, Jubao; He, Xin / Northshore University Healthsystem
NIH 2019 R01 MH	Integrating epigenomics with DNA breathing dynamics for human non-coding disease variants Alexandrov, Boian Stoianov; Duan, Jubao; He, Xin / Northshore University Healthsystem
