Non-coding RNAs play many functional roles in biological processes, such as catalysis, gene expression regulation, and RNA splicing. The roles played by a non-coding RNA are determined by its characteristic structure. RNA structural motifs are recurrent structural components of non-coding RNAs. Because these motifs have conserved structures, they also have conserved biological or structural functions. For instance, the kink-turn motif is found in different kinds of non-coding RNAs, and in all of them it is responsible for protein binding. Alteration of a motif's structure results in loss of function of the motif and, in some cases, in severe disease. For example, destruction of the kink-turn motif in small nucleolar RNA (snoRNA) prevents it from recruiting the L7Ae protein, and thus leads to Dyskeratosis congenita and Prader-Willi syndrome. Therefore, the study of RNA structural motifs will help to elucidate the mechanisms of many diseases and lead to the development of novel treatment strategies. Currently, the essential problems in RNA structural motif research are: 1) identifying all occurrences of a given motif (search), 2) classifying known motif instances based on their structures and functionalities (classification), and 3) defining novel RNA structural motif families (de novo discovery). In this proposal, we aim to devise a suite of computational methods to solve these three problems. First, we will develop a new computational search tool that, in addition to 3D geometry, takes into account base pairing (hydrogen bonding forces) and base stacking (magnetic and electrostatic forces) information. Most existing RNA structural motif search tools are limited in detecting motif instances with flexible geometry. Including base pairing and base stacking information will resolve this issue.
Second, we will develop a novel clustering strategy to solve the classification and de novo discovery problems simultaneously. Existing clustering strategies use a length-dependent structural alignment score (which indicates the structural similarity between two candidate motif instances) as the distance measure, and apply a hierarchical clustering algorithm to identify closely related motif clusters. We plan to include a statistical framework that normalizes the alignment score and thus removes this length dependence. In addition, instead of a hierarchical clustering algorithm, we will adopt a clique-finding algorithm in our clustering strategy, so as to make it applicable to large data sets. We will examine the resulting clusters, compare them with known motifs, and then suggest novel RNA structural motif families. Once these two goals are achieved, we propose to build a database for archiving the motif instances identified by our new search tool. Finally, we will report potential novel RNA structural motif families and encourage experimental investigation of their functionalities. We expect that the proposed work will lead to a better understanding of RNA structural motifs and significantly promote biomedical research.
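
The clustering idea described above can be illustrated with a minimal sketch. It is not the proposal's actual method: the abstract does not specify the statistical framework or the clique-finding algorithm, so the z-score normalization against background scores, the Bron-Kerbosch clique enumeration, the threshold value, the toy scores, and all function and instance names below are assumptions made purely for illustration.

```python
from statistics import mean, stdev


def z_score(raw, background):
    """Normalize a raw alignment score against background scores.

    Assumption: `background` holds scores obtained for unrelated
    candidates of comparable length, so the z-score is comparable
    across motif lengths.
    """
    mu, sigma = mean(background), stdev(background)
    return (raw - mu) / sigma if sigma > 0 else 0.0


def build_graph(normalized, threshold):
    """Link two motif instances when their normalized score passes the threshold."""
    adj = {}
    for (a, b), score in normalized.items():
        adj.setdefault(a, set())
        adj.setdefault(b, set())
        if score >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


def maximal_cliques(adj):
    """Enumerate maximal cliques with Bron-Kerbosch (with pivoting)."""
    def expand(r, p, x):
        if not p and not x:
            yield frozenset(r)
            return
        # Pivot on the vertex with the most neighbors among the candidates.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            yield from expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    yield from expand(set(), set(adj), set())


# Toy normalized scores between hypothetical motif instances.
normalized = {
    ("k1", "k2"): 3.1, ("k1", "k3"): 2.8, ("k2", "k3"): 3.4,
    ("k3", "c1"): 0.2, ("k1", "c1"): -0.5, ("c1", "c2"): 2.9,
}
graph = build_graph(normalized, threshold=2.0)
clusters = sorted(sorted(c) for c in maximal_cliques(graph))
example_z = z_score(18.0, [10.0, 12.0, 14.0])
```

In this toy input, each maximal clique is a group of instances that are all mutually similar after normalization, which is the property that makes a clique a candidate motif cluster. Enumerating maximal cliques can be expensive in the worst case, so a practical tool would likely need additional pruning or heuristics.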

Public Health Relevance

RNA structural motifs are components of non-coding RNAs, which play catalytic, regulatory, and other important roles in many biological processes. Dysfunction of an RNA structural motif results in physiological disorders and causes diseases (such as Dyskeratosis congenita and Prader-Willi syndrome). We plan to devise a suite of computational methods for RNA structural motif search, classification, and discovery, so as to elucidate the mechanisms of diseases related to RNA structural motifs and advance the development of treatment strategies for them.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM102515-02
Application #: 8535798
Study Section: Biodata Management and Analysis Study Section (BDMA)
Program Officer: Brazhnik, Paul
Project Start: 2012-09-01
Project End: 2015-08-31
Budget Start: 2013-09-01
Budget End: 2014-08-31
Support Year: 2
Fiscal Year: 2013
Total Cost: $164,019
Indirect Cost: $40,981

Name: University of Central Florida
Department: Engineering (All Types)
Type: Schools of Engineering
DUNS #: 150805653
City: Orlando
State: FL
Country: United States
Zip Code: 32826

Ma, Hanhui; Tu, Li-Chun; Naseri, Ardalan et al. (2016) CRISPR-Cas9 nuclear dynamics and target recognition in living cells. J Cell Biol 214:529-37
Ma, Hanhui; Tu, Li-Chun; Naseri, Ardalan et al. (2016) Multiplexed labeling of genomic loci with dCas9 and engineered sgRNAs using CRISPRainbow. Nat Biotechnol 34:528-30
Holzhauser, Erwin; Ge, Ping; Zhang, Shaojie (2016) WebSTAR3D: a web server for RNA 3D structural alignment. Bioinformatics 32:3673-3675
Zhong, Cuncong; Zhang, Shaojie (2015) RNAMotifScanX: a graph alignment approach for RNA structural motif identification. RNA 21:333-46
Ge, Ping; Zhang, Shaojie (2015) Computational analysis of RNA structures with chemical probing data. Methods 79-80:60-6
Ge, Ping; Zhang, Shaojie (2015) STAR3D: a stack-based RNA 3D structural alignment tool. Nucleic Acids Res 43:e137
Ma, Hanhui; Naseri, Ardalan; Reyes-Gutierrez, Pablo et al. (2015) Multicolor CRISPR labeling of chromosomal loci in human cells. Proc Natl Acad Sci U S A 112:3002-7
Ge, Ping; Zhong, Cuncong; Zhang, Shaojie (2014) ProbeAlign: incorporating high-throughput sequencing-based structure probing information into ncRNA homology search. BMC Bioinformatics 15 Suppl 9:S15
Zhong, Cuncong; Zhang, Shaojie (2014) Simultaneous folding of alternative RNA structures with mutual constraints: an application to next-generation sequencing-based RNA structure probing. J Comput Biol 21:609-21
Zhong, Cuncong; Andrews, Justen; Zhang, Shaojie (2014) Discovering non-coding RNA elements in Drosophila 3' untranslated regions. Int J Bioinform Res Appl 10:479-97
