Genome-wide association studies (GWASs) provide no mechanism of disease. Moving from GWAS to mechanism requires prior knowledge of cis-regulatory elements (CREs), as most risk variants reside in noncoding regions. Therefore, methods for defining noncoding networks consisting of CREs and their gene targets are essential. Current methods for CRE prediction are based on histone marks, transcription factor binding, and evolutionary sequence conservation. These methods have high sensitivity but poor specificity. For example, over 6,000 enhancers have been reported in the human heart. Subsequent methods to predict CRE targets inherit this poor specificity. Here, we propose a new strategy to precisely predict risk CREs and their target genes from noncoding and coding transcriptomes. Emerging evidence suggests that functional CREs are themselves transcribed. We posit that tissue-specific expression of CRE transcripts can quantitatively define and prioritize functional CREs among candidate loci. We further posit that a CRE transcript and its gene targets are coordinately expressed, providing a quantitative measure of CRE-target gene activity. We will use CRE transcription to prioritize CREs harboring candidate functional genetic variation. To this end, we need to systematically mine multi-scale high-throughput data, which requires efficient bioinformatics methods. In this proposal, we will develop a noncoding transcriptome model. First, we will integrate haplotype blocks from GWAS findings, disease-dependent noncoding expression, and distal chromatin interactions into a computational prioritization of CRE-promoter pairs. We will introduce our "soft threshold" algorithm to assess chromatin-accessible CREs together with their transcribed ncRNAs. Second, building on our earlier success with the PGnet algorithm, we will build a tripartite network (disease traits, CRE variants, and target genes) with a nonparametric model to infer risk CREs and gene targets.
We will develop a novel gene-based association approach and compare it with PrediXcan. We will also compare our predictions with those of TargetFinder. In collaboration with an expert in the cardiac conduction system, I have selected atrial fibrillation (AF) as a research platform. AF is the most common human arrhythmia, affecting over 33 million people worldwide. The deliverables of this project will include critical cardiac rhythm CREs, CRE variants, and target genes. The ultimate goal of this work is an evaluated computational model that sets the stage for future functional evaluations. This new analytic suite for defining AF-associated noncoding regulatory pathways will have broader implications for other diseases.
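
The co-expression premise above (a CRE transcript and its gene targets are coordinately expressed) can be illustrated with a minimal sketch. This is not the PGnet algorithm, the "soft threshold" algorithm, or any method from this proposal; it only assumes that co-expression is scored with a rank-based (Spearman) correlation across samples, chosen here because it is nonparametric. The function names, gene labels, and expression values are hypothetical.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(vx * vy)


def rank_candidate_targets(cre_expr, gene_expr):
    """Order candidate target genes by co-expression with one CRE transcript."""
    scores = {gene: spearman(cre_expr, expr) for gene, expr in gene_expr.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: one CRE transcript and three candidate genes
# measured across the same five samples.
cre_transcript = [1.0, 2.0, 3.0, 4.0, 5.0]
candidate_genes = {
    "geneA": [2.0, 4.1, 6.2, 7.9, 10.5],
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],
    "geneC": [3.0, 1.0, 4.0, 2.0, 5.0],
}

for gene, rho in rank_candidate_targets(cre_transcript, candidate_genes):
    print(f"{gene}\t{rho:+.2f}")
```

In a real analysis, such a score would be only one line of evidence and would be combined with the GWAS haplotype blocks and chromatin-interaction data described above.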

Public Health Relevance

I have long been committed to developing statistical algorithms and software tools for the effective functional integration of transcriptome data with clinical endpoints (e.g., the Bioconductor packages OrderedList and seq2pathway and the online tool GOModule). This project will systematically build new predictive models to define the noncoding regulatory networks (consisting of cis-regulators and their target genes) that underlie disease. Using atrial fibrillation as a research platform, this proposal introduces a novel noncoding transcriptome approach that will set the stage for future functional evaluations.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Exploratory/Developmental Grants (R21)
Project #
1R21LM012619-01
Application #
9374995
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Ye, Jane
Project Start
2017-09-07
Project End
2019-08-31
Budget Start
2017-09-07
Budget End
2018-08-31
Support Year
1
Fiscal Year
2017
Total Cost
Indirect Cost
Name
University of Chicago
Department
Pediatrics
Type
Schools of Medicine
DUNS #
005421136
City
Chicago
State
IL
Country
United States
Zip Code
60637
Steimle, Jeffrey D; Rankin, Scott A; Slagle, Christopher E et al. (2018) Evolutionarily conserved Tbx5-Wnt2/2b pathway orchestrates cardiopulmonary development. Proc Natl Acad Sci U S A 115:E10615-E10624
Yang, Xinan H; Nadadur, Rangarajan D; Hilvering, Catharina Re et al. (2017) Transcription-factor-dependent enhancer transcription defines a gene regulatory network for cardiac rhythm. Elife 6: