The goal of this project is to functionally annotate genetic variants involved in post-transcriptional regulation of RNA expression, which extends and complements the current focus of ENCODE data analysis. Recently, tremendous success has been achieved in constructing catalogs of genetic variants in disease genomes and across populations. The next great challenge is to identify causal variants and elucidate their potential functions in biological and disease processes. To this end, research efforts have focused on variants located in protein-coding, promoter, and splice site regions because of their apparent impact on gene expression. However, many newly identified disease-associated variants reside in other non-coding regions, such as introns, where they may confer regulatory function on the related gene. The mechanisms of these variants have been hard to decipher. Many of them are expected to function at the post-transcriptional level, thus affecting mRNA expression. In humans, a myriad of processes mediate RNA expression at the post-transcriptional stage, such as splicing, editing, polyadenylation, and mRNA decay. Post-transcriptional regulation is extremely versatile, yet tightly regulated, and affects most human genes. Despite its importance, how to accurately identify functional genetic variants in these processes remains a key question in the field. To address this question, the large collection of ENCODE expression and protein-binding data represents an invaluable resource. We will develop novel methodologies to make full use of ENCODE and other publicly available data sets, complemented by further bioinformatic prediction and experimental validation. This work will allow a previously unattained level of understanding of genetic variants in post-transcriptional regulation of RNA expression and provide new means to tackle the imperative task of functionally annotating genetic variants.

Public Health Relevance

Post-transcriptional regulation can significantly alter gene expression and contribute to human diseases. The proposed research aims to functionally annotate genetic variants in post-transcriptional regulation of mRNA expression using ENCODE data. This work will provide a mechanistic basis for how genetic variation may contribute to disease, so that future interventions can target specific genes therapeutically.

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 1U01HG009417-01
Application #: 9247517
Study Section: Special Emphasis Panel (ZHG1-HGR-L (O1))
Program Officer: Feingold, Elise A
Project Start: 2017-02-01
Project End: 2021-01-31
Budget Start: 2017-02-01
Budget End: 2018-01-31
Support Year: 1
Fiscal Year: 2017
Total Cost: $426,673
Indirect Cost: $115,674

Name: University of California Los Angeles
Department: Physiology
Type: Schools of Arts and Sciences
DUNS #: 092530369
City: Los Angeles
State: CA
Country: United States
Zip Code: 90095

Arefeen, Ashraful; Liu, Juntao; Xiao, Xinshu et al. (2018) TAPAS: tool for alternative polyadenylation site analysis. Bioinformatics 34:2521-2529
Hsiao, Yun-Hua Esther; Bahn, Jae Hoon; Yang, Yun et al. (2018) RNA editing in nascent RNA affects pre-mRNA splicing. Genome Res 28:812-823
Harvey, Samuel E; Xu, Yilin; Lin, Xiaodan et al. (2018) Coregulation of alternative splicing by hnRNPM and ESRP1 during EMT. RNA 24:1326-1338
Brümmer, Anneke; Yang, Yun; Chan, Tracey W et al. (2017) Structure-mediated modulation of mRNA abundance by A-to-I editing. Nat Commun 8:1255