Non-coding RNAs play many functional roles in biological processes, such as catalysis, regulation of gene expression, and RNA splicing. These roles are determined by the characteristic structures of the non-coding RNAs. RNA structural motifs are recurrent structural components of non-coding RNAs. Because their structures are conserved, their biological or structural functions are conserved as well. For instance, the kink-turn motif is found in different kinds of non-coding RNAs, and in all of them it is responsible for protein binding. Alteration of a motif's structure results in loss of function of the motif and, in some cases, severe disease. For example, disruption of the kink-turn motif in small nucleolar RNA (snoRNA) prevents it from recruiting the L7Ae protein, and thus leads to Dyskeratosis congenita and Prader-Willi syndrome. Therefore, the study of RNA structural motifs will help elucidate the mechanisms of many diseases and lead to the development of novel treatment strategies. Currently, the essential problems in RNA structural motif research are: 1) identifying all occurrences of a given motif (search), 2) classifying known motif instances based on their structures and functionalities (classification), and 3) defining novel RNA structural motif families (de novo discovery). In this proposal, we aim to devise a suite of computational methods to solve these three problems. First, we will develop a new computational search tool that, in addition to 3D geometry, takes into account base pairing (hydrogen bonding) and base stacking (van der Waals and electrostatic forces) information. Most existing RNA structural motif search tools are limited in detecting motif instances with flexible geometry. Including base-pairing and base-stacking information is expected to address this limitation.
Second, we will develop a novel clustering strategy to solve the classification and de novo discovery problems simultaneously. Existing clustering strategies use a length-dependent structural alignment score (which indicates the structural similarity between two candidate motif instances) as the distance measure, and apply a hierarchical clustering algorithm to identify closely related motif clusters. Because the score depends on length, it is not directly comparable across motif instances of different sizes. We plan to include a statistical framework that normalizes the alignment score and thus removes this length dependence. In addition, instead of hierarchical clustering, we will adopt a clique-finding algorithm in our clustering strategy, so as to make it applicable to large data sets. We will examine the resulting clusters, compare them with known motifs, and then suggest novel RNA structural motif families. Once these two goals are achieved, we will build a database archiving the motif instances identified by our new search tool. Finally, we will report potential novel RNA structural motif families and encourage experimental investigation of their functionalities. We expect that the proposed work will lead to a better understanding of RNA structural motifs and significantly promote biomedical research.
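The clustering idea described above can be illustrated with a minimal sketch. It is not the proposed method: the proposal does not specify its statistical framework or its clique-finding algorithm, so the sketch assumes a simple z-score against a background sample of scores at the same alignment length, and uses the Bron-Kerbosch algorithm with pivoting as one possible clique finder. All names (`z_score`, `build_graph`, `maximal_cliques`, the motif labels, the threshold) and all numbers are hypothetical.

```python
from statistics import mean, stdev


def z_score(raw, null_scores):
    """Normalize a raw alignment score against a background sample (assumed null model)."""
    return (raw - mean(null_scores)) / stdev(null_scores)


def build_graph(pair_scores, null_by_length, threshold):
    """Connect two motif instances when their normalized score reaches the threshold.

    pair_scores maps (a, b) to (raw_score, aligned_length).
    null_by_length maps aligned_length to a list of background scores.
    """
    graph = {}
    for (a, b), (raw, length) in pair_scores.items():
        graph.setdefault(a, set())
        graph.setdefault(b, set())
        if z_score(raw, null_by_length[length]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Bron-Kerbosch with pivoting; yields each maximal clique as a frozenset."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            yield from expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(graph), set())


# Toy data (invented): background scores grow with alignment length.
null_by_length = {10: [10, 12, 14, 16, 18], 20: [20, 24, 28, 32, 36]}
pair_scores = {
    ("m1", "m2"): (24, 10),
    ("m1", "m3"): (25, 10),
    ("m2", "m3"): (23, 10),
    ("m3", "m4"): (36, 20),  # high raw score, but unremarkable for its length
    ("m4", "m5"): (45, 20),
}
graph = build_graph(pair_scores, null_by_length, threshold=2.0)
clusters = set(maximal_cliques(graph))
```

In this toy example the pair (m3, m4) has a higher raw score than any length-10 pair, yet it falls below the threshold after normalization, so m3 stays with m1 and m2 while m4 groups with m5. Each maximal clique is a set of instances that are all mutually similar, which is a stricter grouping criterion than linkage-based hierarchical clustering.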

Public Health Relevance

RNA structural motifs are components of non-coding RNAs, which play catalytic, regulatory, and other important roles in many biological processes. Dysfunction of an RNA structural motif can result in physiological disorders and cause diseases (such as Dyskeratosis congenita and Prader-Willi syndrome). We plan to devise a suite of computational methods for RNA structural motif search, classification, and discovery, so as to elucidate the mechanisms of diseases related to RNA structural motifs and advance the development of treatment strategies for them.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Brazhnik, Paul
University of Central Florida
Engineering (All Types)
Schools of Engineering
United States
Ge, Ping; Zhong, Cuncong; Zhang, Shaojie (2014) ProbeAlign: incorporating high-throughput sequencing-based structure probing information into ncRNA homology search. BMC Bioinformatics 15 Suppl 9:S15
Zhong, Cuncong; Zhang, Shaojie (2014) Simultaneous folding of alternative RNA structures with mutual constraints: an application to next-generation sequencing-based RNA structure probing. J Comput Biol 21:609-21
Zhong, Cuncong; Andrews, Justen; Zhang, Shaojie (2014) Discovering non-coding RNA elements in Drosophila 3' untranslated regions. Int J Bioinform Res Appl 10:479-97
Li, Yuan; Zhong, Cuncong; Zhang, Shaojie (2014) Finding consensus stable local optimal structures for aligned RNA sequences and its application to discovering riboswitch elements. Int J Bioinform Res Appl 10:498-518
Ge, Ping; Zhang, Shaojie (2013) Incorporating phylogenetic-based covarying mutations into RNAalifold for RNA consensus structure prediction. BMC Bioinformatics 14:142
Zhong, Cuncong; Zhang, Shaojie (2013) Efficient alignment of RNA secondary structures using sparse dynamic programming. BMC Bioinformatics 14:269