Long non-coding RNAs (lncRNAs) play regulatory roles in cellular processes and disease development, and they have emerged as key regulators of diverse cellular functions. Considerable effort has gone into investigating lncRNA functions through both experimental determination and theoretical modeling, leading to a rudimentary understanding of this class of RNAs. However, these efforts cannot keep pace with the rapid growth of diverse genetic data and the urgent demand for function annotation of individual lncRNAs, which is hindered by the very large number of lncRNAs and the high cost of experiments. This proposal aims to address this issue by providing efficient and user-friendly tools for key lncRNA discovery and lncRNA function annotation. To do so, we will develop an integrated bioinformatics and systems biology approach, the ISSNLncFA system, which enables the integration of multiple types of omics data and a comprehensive understanding of lncRNA functions. We propose three specific aims toward lncRNA function annotation:

(1) To develop a novel Co-Modules-based LncRNA Function Annotation (CoMoLncFA) model to detect key lncRNAs and to annotate lncRNA functions at the post-transcriptional level as lncRNA-PCG co-modules, lncRNA-pathway association networks, and lncRNA triplets (lncRNA-miRNA/TF-PCG), by considering the expression profiles of lncRNAs, protein-coding genes (PCGs), miRNAs, and transcription factors (TFs), and by integrating curated protein-protein interactions and biological pathways.

(2) To develop a novel Structure-based LncRNA-protein Function Annotation (STRULncFA) model to characterize the lncRNAs identified in Aim 1, using their primary sequences and secondary structures to detect lncRNA-protein functional relations, and to further reveal the regulatory roles and mechanisms of these lncRNAs by determining the binding sites in both the lncRNA and the protein.

(3) To experimentally validate the identified abnormal lncRNAs and their cellular products; to validate the identified lncRNA-protein interacting pairs and the predicted binding sites; and to develop software tools and an environment for functional annotation of lncRNAs, use these tools to evaluate the overall proposed approach, apply them to identify lncRNA functions that may be involved in cell states, species, diseases, and cancers, and build lncRNA function databases.

We will build the models, tools, and databases and make them available to the public in a timely fashion. Our achievements will lead to a more comprehensive understanding of lncRNA functions and regulatory roles in cell and disease states. Moreover, with appropriate changes, our models and tools can be readily adapted to other function annotation tasks and disease studies, and will thus advance the general function annotation community and the development of disease-related drugs and therapies.
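As a rough illustration of the co-expression idea underlying Aim 1, the sketch below pairs lncRNAs with protein-coding genes whose expression profiles are strongly correlated across samples. It is not the CoMoLncFA model: the Pearson correlation, the 0.9 threshold, the function names, and the toy expression values are all assumptions made for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2 samples)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no co-expression signal
    return cov / sqrt(var_x * var_y)


def coexpressed_pairs(lnc_expr, pcg_expr, threshold=0.9):
    """Return (lncRNA, PCG, r) triples with |r| >= threshold.

    The threshold is an illustrative assumption, not a value from the proposal.
    """
    pairs = []
    for lnc, lnc_profile in lnc_expr.items():
        for gene, gene_profile in pcg_expr.items():
            r = pearson(lnc_profile, gene_profile)
            if abs(r) >= threshold:
                pairs.append((lnc, gene, r))
    return pairs


# Toy expression profiles over four samples (hypothetical values).
lnc_expr = {
    "lncA": [1.0, 2.0, 3.0, 4.0],
    "lncB": [4.0, 1.0, 3.0, 2.0],
}
pcg_expr = {
    "gene1": [2.0, 4.0, 6.0, 8.0],
    "gene2": [8.0, 6.0, 4.0, 2.0],
    "gene3": [1.0, 3.0, 2.0, 4.0],
}

pairs = coexpressed_pairs(lnc_expr, pcg_expr)
for lnc, gene, r in pairs:
    print(f"{lnc}\t{gene}\t{r:+.2f}")
```

In a fuller pipeline, pairs like these would only be a starting point; the proposal additionally integrates miRNA and TF profiles, protein-protein interactions, and pathway annotations, none of which this sketch attempts.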

Public Health Relevance

The ISSNLncFA software package will allow investigators to characterize lncRNA functions by integrating RNA-Seq and microRNA-Seq data, together with protein and RNA structure data, using systems biology approaches. As a prototype to test our system, we aim to use it to study myelodysplastic syndromes (MDS). Although MDS serves as the prototype disease for this proposal, the system developed will be applicable to multiple diseases with complex phenotypes.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Resat, Haluk
University of Texas Health Science Center Houston
Sch Allied Health Professions
United States
Adjeroh, Donald; Allaga, Maen; Tan, Jun et al. (2018) Feature-Based and String-Based Models for Predicting RNA-Protein Interaction. Molecules 23:
Xu, Yungang; Zhao, Weiling; Olson, Scott D et al. (2018) Alternative splicing links histone modifications to stem cell fate decision. Genome Biol 19:133
Chyr, Jacqueline; Guo, Dongmin; Zhou, Xiaobo (2018) LSCC SNP variant regulates SOX2 modulation of VDAC3. Oncotarget 9:22340-22352
Luo, Jiesi; Liu, Liang; Venkateswaran, Suresh et al. (2017) RPI-Bind: a structure-based method for accurate identification of RNA-protein binding sites. Sci Rep 7:614
Liu, Keqin; Beck, Dominik; Thoms, Julie A I et al. (2017) Annotating function to differentially expressed LincRNAs in myelodysplastic syndrome using a network-based method. Bioinformatics 33:2622-2630
Yu, Wenshuai; Zhao, Shengjie; Wang, Yongcui et al. (2017) Identification of cancer prognosis-associated functional modules using differential co-expression networks. Oncotarget 8:112928-112941