We propose to make the growing body of experimental three-dimensional (3D) RNA structure data more useful to biomedical researchers by providing improved methods to integrate 3D RNA structure with sequence and other experimental data. New annotation tools and services developed in this project will be integrated into the Nucleic Acid Database (NDB), which will provide a platform for disseminating project results. Among the expected benefits are better methods for 1) predicting 3D structures of functional RNA motifs from sequence, 2) searching for non-coding RNA genes in genomes, and 3) improving alignments of homologous RNA sequences. We focus attention on recurrent, modular RNA 3D motifs, which occur in a wide variety of structured RNA molecules and which give RNA its distinctive 3D shape. These include hairpin loops, internal loops, junction loops, and tertiary interaction motifs. We will develop systematic methods to identify, classify, and name recurrent RNA 3D motifs and to define search criteria that reliably find instances of each motif in 3D structures. An annotation procedure will be established so that new motifs are rapidly identified in new structures and vetted in collaboration with other members of the RNA Ontology Consortium. All experimental RNA 3D structures will be annotated with lists of motifs. A Motif Atlas will be created to make information about 3D motif instances in structures available to users. This new Atlas, containing the motif annotations, will be added to the NDB, a web resource containing structural and functional annotations of nucleic acid-containing macromolecules. An update procedure will be developed so that motif data and Atlas entries are added to the NDB automatically as new RNA structures become available in the PDB archive. 
We will extend the query capabilities of the NDB with tools for users to search the NDB for RNA motifs using multiple criteria and to integrate search results with experimental confidence measures. We will maintain statistics on the occurrences of motifs and base pairing interactions, incorporating experimental confidence measures, and make these data available as a resource for refinement and validation tools. Each entry in the Motif Atlas will include a structural alignment of all instances of the motif to reveal sequence variants for each motif, including patterns of insertions and deletions. These data will be combined with statistical covariation data for Watson-Crick and non-Watson-Crick base pairs and statistical data for base-stacking and base-backbone interactions to develop probabilistic models for the sequence variability of each modular RNA 3D motif. These models will be used to deploy a web-based tool for users to find the 3D motif from the Motif Atlas that best matches the sequences of hairpin, internal, or junction loops that they submit.
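
As a rough illustration of the sequence-to-motif matching idea described above, the following minimal sketch builds a per-position nucleotide profile from aligned instances of each motif and ranks motifs by log-odds score for a submitted loop sequence. It is a deliberate simplification and not the project's method: it treats positions as independent, so it ignores base pair covariation, base-stacking and base-backbone statistics, and insertions and deletions. The function names (`build_profile`, `log_odds`, `best_motif`), the pseudocount, the uniform background, and the toy GNRA and UNCG tetraloop sequences are all assumptions made for illustration and are not taken from the Motif Atlas.

```python
import math
from collections import Counter

ALPHABET = "ACGU"


def build_profile(aligned_instances, pseudocount=1.0):
    """Estimate per-column nucleotide probabilities from equal-length,
    gap-free aligned motif instances (hypothetical helper)."""
    length = len(aligned_instances[0])
    if any(len(seq) != length for seq in aligned_instances):
        raise ValueError("all instances must have the same aligned length")
    total = len(aligned_instances) + pseudocount * len(ALPHABET)
    profile = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned_instances)
        # The pseudocount keeps unseen nucleotides from getting probability 0.
        profile.append({nt: (counts[nt] + pseudocount) / total for nt in ALPHABET})
    return profile


def log_odds(sequence, profile, background=0.25):
    """Log-odds of the sequence under the profile versus a uniform background.
    Sequences of the wrong length score -inf because this sketch has no indels."""
    if len(sequence) != len(profile):
        return float("-inf")
    return sum(math.log(col[nt] / background) for nt, col in zip(sequence, profile))


def best_motif(sequence, profiles):
    """Return (name of best-scoring motif, dict of all scores)."""
    scores = {name: log_odds(sequence, prof) for name, prof in profiles.items()}
    return max(scores, key=scores.get), scores


# Toy aligned instances of two well-known hairpin tetraloop families
# (illustrative sequences only, not Motif Atlas data).
toy_profiles = {
    "GNRA": build_profile(["GAAA", "GCAA", "GUGA", "GAGA"]),
    "UNCG": build_profile(["UUCG", "UACG", "UCCG", "UGCG"]),
}

if __name__ == "__main__":
    name, scores = best_motif("GGAA", toy_profiles)
    print(name, scores)
```

A fuller model along the lines the abstract describes would score base-paired positions jointly rather than independently and would allow for insertions and deletions.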

Public Health Relevance

Recent work shows that most of the human genome is transcribed, most of the produced RNA is non-protein coding, and a large fraction of it is critical for human reproduction, growth, and development. This proposal aims to make the growing body of experimental three-dimensional (3D) RNA structure data more useful to the biomedical research community by providing improved methods to integrate 3D RNA structure with sequence and other experimental data.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM085328-03
Application #: 8312563
Study Section: Macromolecular Structure and Function D Study Section (MSFD)
Program Officer: Preusch, Peter C
Project Start: 2010-08-10
Project End: 2014-07-31
Budget Start: 2012-08-01
Budget End: 2013-07-31
Support Year: 3
Fiscal Year: 2012
Total Cost: $304,899
Indirect Cost: $52,196

Name: Bowling Green State University
Department: Chemistry
Type: Schools of Arts and Sciences
DUNS #: 617407325
City: Bowling Green
State: OH
Country: United States
Zip Code: 43403

Publications

Akkuratov, Evgeny E; Walters, Lorraine; Saha-Mandal, Arnab et al. (2014) Bioinformatics analysis of plant orthologous introns: identification of an intronic tRNA-like sequence. Gene 548:81-90
Coimbatore Narayanan, Buvaneswari; Westbrook, John; Ghosh, Saheli et al. (2014) The Nucleic Acid Database: new features and capabilities. Nucleic Acids Res 42:D114-22
Rahrig, Ryan R; Petrov, Anton I; Leontis, Neocles B et al. (2013) R3D Align web server for global nucleotide to nucleotide alignments of RNA 3D structures. Nucleic Acids Res 41:W15-21
Havrila, Marek; Reblova, Kamila; Zirbel, Craig L et al. (2013) Isosteric and nonisosteric base pairs in RNA motifs: molecular dynamics and bioinformatics study of the sarcin-ricin internal loop. J Phys Chem B 117:14302-19
Petrov, Anton I; Zirbel, Craig L; Leontis, Neocles B (2013) Automated classification of RNA 3D motifs and the RNA 3D Motif Atlas. RNA 19:1327-40
Abu Almakarem, Amal S; Petrov, Anton I; Stombaugh, Jesse et al. (2012) Comprehensive survey and geometric classification of base triples in RNA structures. Nucleic Acids Res 40:1407-23
Petrov, Anton I; Zirbel, Craig L; Leontis, Neocles B (2011) WebFR3D--a server for finding, aligning and analyzing recurrent RNA 3D motifs. Nucleic Acids Res 39:W50-5