Long non-coding RNAs (lncRNAs) have emerged as key regulators of diverse cellular processes and of disease development. Great efforts have been made to investigate lncRNA functions through both experimental determination and theoretical modeling, leading to a rudimentary understanding of this class of RNAs. However, these efforts cannot keep pace with the rapid growth of diverse genetic data and the urgent demand for function annotation of individual lncRNAs, which is hindered by the sheer number of lncRNAs and the high cost of experiments. This proposal aims to address this issue by providing efficient and user-friendly tools for key lncRNA discovery and lncRNA function annotation. To do so, we will develop an integrated bioinformatics and systems biology approach, the ISSNLncFA system, which enables the integration of multiple types of omics data and a comprehensive understanding of lncRNA functions. We propose three specific aims for lncRNA function annotation:
(1) To develop a novel Co-Modules-based LncRNA Function Annotation (CoMoLncFA) model to detect key lncRNAs and to annotate lncRNA functions at the post-transcriptional level as lncRNA-PCG co-modules, lncRNA-pathway association networks, and lncRNA triplets (lncRNA-miRNA/TF-PCG), by considering the expression profiles of lncRNAs, protein-coding genes (PCGs), miRNAs, and transcription factors (TFs), and by integrating curated protein-protein interactions and biological pathways.
(2) To develop a novel Structure-based LncRNA-protein Function Annotation (STRULncFA) model to characterize the lncRNAs identified in Aim 1 by using their primary sequences and secondary structures to detect lncRNA-protein functional relations, and to further reveal the regulatory roles and mechanisms of these lncRNAs by determining the binding sites in both the lncRNA and the protein.
(3) To experimentally validate the identified abnormal lncRNAs and their cellular products, as well as the identified lncRNA-protein interacting pairs and the predicted binding sites; to develop software tools and an environment for functional annotation of lncRNAs; to use these tools to evaluate the overall proposed approach; and to apply them to identify lncRNA functions that may be involved in cell states, species, diseases and cancers, and to build lncRNA function databases.
We expect to build the models, tools and databases and to make them available to the public in a timely fashion. Our achievements will lead to a complete understanding of lncRNA functions and regulatory roles in cell and disease states. Moreover, our models and tools can be readily adapted, with appropriate changes, to other function annotation tasks and disease studies, and will thus advance the general function annotation community and the development of disease-related drugs and therapies.

Public Health Relevance

The ISSNLncFA software package will allow investigators to characterize lncRNA functions by integrating RNA-Seq and microRNA-Seq data with protein and RNA structure data, using systems biology approaches. As a prototype to test our system, we aim to use it to study myelodysplastic syndromes (MDS). Although MDS serves as the prototype disease for this proposal, the system developed will be applicable to multiple diseases with complex phenotypes.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM123037-04
Application #
9983714
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
Project Start
2017-09-15
Project End
2021-07-31
Budget Start
2020-08-01
Budget End
2021-07-31
Support Year
4
Fiscal Year
2020
Total Cost
Indirect Cost
Name
University of Texas Health Science Center Houston
Department
Type
Sch Allied Health Professions
DUNS #
800771594
City
Houston
State
TX
Country
United States
Zip Code
77030
Adjeroh, Donald; Allaga, Maen; Tan, Jun et al. (2018) Feature-Based and String-Based Models for Predicting RNA-Protein Interaction. Molecules 23:
Xu, Yungang; Zhao, Weiling; Olson, Scott D et al. (2018) Alternative splicing links histone modifications to stem cell fate decision. Genome Biol 19:133
Chyr, Jacqueline; Guo, Dongmin; Zhou, Xiaobo (2018) LSCC SNP variant regulates SOX2 modulation of VDAC3. Oncotarget 9:22340-22352
Luo, Jiesi; Liu, Liang; Venkateswaran, Suresh et al. (2017) RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites. Sci Rep 7:614
Liu, Keqin; Beck, Dominik; Thoms, Julie A I et al. (2017) Annotating function to differentially expressed LincRNAs in myelodysplastic syndrome using a network-based method. Bioinformatics 33:2622-2630
Yu, Wenshuai; Zhao, Shengjie; Wang, Yongcui et al. (2017) Identification of cancer prognosis-associated functional modules using differential co-expression networks. Oncotarget 8:112928-112941