RNA Secondary Structure Prediction and Analysis Software: We continue to improve an RNA folding algorithm (MPGAfold) that uses concepts from genetic algorithms, and we apply it to various biological problems (see below). An optimized version adapted to run on Linux clusters using MPI is available upon request. The algorithm is capable of predicting RNA pseudoknots and exploring folding pathways that contain multiple functional conformations. A Java-based visualizer for depicting population evolution has also been developed; coupled with the MPI version of MPGAfold, it makes the system more user friendly and portable and allows a detailed exploration of the structure population space. Several groups are now using these algorithms. STRUCTURELAB, the heterogeneous bioinformatical RNA analysis workbench, which permits the use of a broad array of approaches for RNA structure analysis, has been continually enhanced. It has been used for the visualization of folding pathways in conjunction with the genetic algorithm (see above). STRUCTURELAB and other new tools contain several features which, when used together, act as a set of data mining methodologies to aid in the discovery of RNA folding patterns. These systems have been adapted to other environments inside and outside our laboratory and NIH and are available for download from our newly enhanced Web site. Many groups from around the world have downloaded this software. KNetFold, our new algorithm for RNA secondary structure prediction, has also been enhanced. The methodology integrates thermodynamic and compensatory base change information using an innovative machine-learning algorithm (a hierarchical network of k-nearest neighbor classifiers). KNetFold has been shown to outperform other RNA secondary structure prediction programs. Another program, CorreLogo, has also been enhanced.
It depicts, in a 3-dimensional plot, correlations that exist between base pairs in a secondary structure. These methodologies use mutual information derived from a sequence alignment. Both KNetFold and CorreLogo are available as Web servers on our newly configured Web site, where they can also be downloaded. Many groups have downloaded these software packages. We have also developed a suite of programs to investigate the sequence and structural characteristics that are inherent in the control of translation initiation of mRNAs. We developed a new Web server called RADAR that provides a range of functionality for RNA data analysis. It can align structure-annotated RNA sequences so that both sequence and structure information are used during the alignment process. This server can perform pairwise structure alignment, multiple structure alignment, database search and clustering. RADAR also provides two major features: constrained alignment of RNA secondary structures and prediction of the consensus structure for a set of RNA sequences. In addition, a new Web-accessible RNA secondary structure database, RmotifDB, was developed. RmotifDB is also integrated with a gene ontology database. Algorithms have been developed to determine regions within genomes that are indicative of non-coding RNAs. Structural characteristics which distinguish those regions that may occur in intergenic or control regions of RNA are being determined. These methods have been applied in various biological contexts (see below).

Determination of Biologically Related RNA Secondary Structure Folding Characteristics: The above-described computational tools have been employed in studying RNA structural characteristics, folding pathways and functional intermediates of various RNAs.
These are exemplified by the analysis of the folding pathways of the HIV 5' and 3' non-coding regions and of the control mechanisms of the hepatitis delta virus, interleukin-2, rotavirus, dengue fever (and the flaviviruses in general) and the turnip crinkle virus. They are also providing insight into cancer development that is inducible by the upregulation of eIF4E or controlled by the presence or absence of PDCD4. Algorithms that were developed for the determination of non-coding RNAs in genomes are being applied to the intergenic regions of E. coli and to Musashi binding sites, and to determine potential structural RNA elements that are involved in RNA translation initiation. Each of these sites is proving to contain unique features and characteristics that are inherent to the different biological domains being examined.

Three-Dimensional RNA Structural Modeling and Analysis: In order to understand RNA structures, nanostructures, folding pathways and the structural effects of RNA-protein interactions at the atomic level, structural elements of RNA molecules are being studied using molecular mechanics and molecular dynamics simulations. Studies include RNA tetraloops, bulge loops, kissing loops, n-way junctions and motifs that are of specific importance to the function of molecules that are involved in disease processes. These studies have led to the understanding of subtle atomic-level interactions for RNA molecular function and RNA nanodesign. We have, for example, predicted the structure of the wild-type telomerase pseudoknot and have performed molecular dynamics studies on the RNA hairpin and pseudoknot that are important structural elements in telomerase. These studies show that an unusual sequence of non-canonical base pairs has dynamic conformational characteristics that induce the formation of the pseudoknot. These results have significant implications concerning genetic diseases such as dyskeratosis congenita, aplastic anemia and cancer.
We have also applied our secondary structure prediction methodologies, 3-D modeling methodologies (including the molecular modeling software RNA2D3D developed with our group) and molecular dynamics to discover a unique motif in the turnip crinkle virus 3' UTR. This structural motif has experimentally been shown to bind ribosomes and to act as a translational enhancer. Thus we have discovered a new paradigm for translational enhancement. We have also applied a somewhat different technique, elastic network interpolation (ENI), to generate feasible transition pathways between two different 3D conformations of RNA molecules. With ENI, 3D folding pathways can be generated with good accuracy and much more quickly than with conventional molecular dynamics methods. Research is ongoing to develop a methodology which combines computational modeling with experimental structure determination to elucidate the structure of various RNAs.

RNA Nanobiology: We have filed a patent and have a publication related to a novel design for a hexagonal RNA nano-ring and a self-assembling nanotube which is currently undergoing experimental testing. In addition, we developed a Web-accessible RNA motif database, RNAJunction, consisting of RNA n-way junctions and kissing loops. This database is being used for the construction of RNA tectoshapes. The RNAJunction database contains structure and sequence information for RNA structural elements such as helical junctions, internal loops, bulges and loop-loop interactions. Our database provides a user-friendly way of searching for significant RNA structural motifs. This database is useful for analyzing RNA structures as well as for designing novel RNA structures on a nanoscale. The construction of the RNA structures is being greatly enhanced by the use of our newly developed software NanoTiler.
NanoTiler is a comprehensive system that significantly enhances the design of RNA-based nanoparticles by supporting both automatic and interactive development, placement and determination of RNA building blocks. Z01 BC 08382-1
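
The abstract notes that KNetFold and CorreLogo draw on mutual information derived from a sequence alignment. As a rough illustration of that quantity only, the following minimal sketch computes the mutual information between two alignment columns. It is not the KNetFold or CorreLogo implementation; the function name `column_mutual_information`, the gap handling, and the toy alignment are assumptions made for this example.

```python
import math
from collections import Counter


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Hypothetical helper for illustration. `alignment` is a list of
    equal-length aligned sequences; rows with a gap ('-') in either
    column are skipped (an assumption, not a documented convention).
    """
    pairs = [(seq[i], seq[j]) for seq in alignment
             if seq[i] != '-' and seq[j] != '-']
    n = len(pairs)
    if n == 0:
        return 0.0

    pair_counts = Counter(pairs)
    left_counts = Counter(a for a, _ in pairs)
    right_counts = Counter(b for _, b in pairs)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = count * n / (left_counts[a] * right_counts[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Toy alignment (invented): columns 0 and 4 covary as complementary
# base pairs, while columns 2 and 3 are fully conserved.
toy_alignment = ["GCAUC", "CGAUG", "AUAUU", "UAAUA"]
covarying = column_mutual_information(toy_alignment, 0, 4)
conserved = column_mutual_information(toy_alignment, 2, 3)
```

In this toy case the covarying column pair carries 2 bits of mutual information and the conserved pair carries none, which is why compensatory base changes stand out as candidate base pairs while invariant columns do not.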

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Intramural Research (Z01)
Project #: 1Z01BC008382-24
Application #: 7592576
Support Year: 24
Fiscal Year: 2007
Total Cost: $842,930

Name: National Cancer Institute Division of Basic Sciences
Country: United States

de Sousa Abreu, Raquel; Sanchez-Diaz, Patricia C; Vogel, Christine et al. (2009) Genomic analyses of musashi1 downstream targets show a strong association with cancer-related processes. J Biol Chem 284:12125-35
Yingling, Yaroslava G; Shapiro, Bruce A (2007) The impact of dyskeratosis congenita mutations on the structure and dynamics of the human telomerase RNA pseudoknot domain. J Biomol Struct Dyn 24:303-20
Yingling, Yaroslava G; Shapiro, Bruce A (2007) Computational design of an RNA hexagonal nanoring and an RNA nanotube. Nano Lett 7:2328-34
Khaladkar, Mugdha; Bellofatto, Vivian; Wang, Jason T L et al. (2007) RADAR: a web server for RNA data analysis and research. Nucleic Acids Res 35:W300-4
Shapiro, Bruce A; Kasprzak, Wojciech; Grunewald, Calvin et al. (2006) Graphical exploratory data analysis of RNA secondary structure dynamics predicted by the massively parallel genetic algorithm. J Mol Graph Model 25:514-31
Linnstaedt, Sarah D; Kasprzak, Wojciech K; Shapiro, Bruce A et al. (2006) The role of a metastable RNA secondary structure in hepatitis delta virus genotype III RNA editing. RNA 12:1521-33
Seko, Yuko; Cole, Steven; Kasprzak, Wojciech et al. (2006) The role of cytokine mRNA stability in the pathogenesis of autoimmune disease. Autoimmun Rev 5:299-305
Zhang, Jiuchun; Zhang, Guohua; Guo, Rong et al. (2006) A pseudoknot in a preactive form of a viral RNA is part of a structural switch activating minus-strand synthesis. J Virol 80:9181-91
Yingling, Yaroslava G; Shapiro, Bruce A (2006) The prediction of the wild-type telomerase RNA pseudoknot structure and the pivotal role of the bulge in its formation. J Mol Graph Model 25:261-74
Bindewald, Eckart; Shapiro, Bruce A (2006) RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers. RNA 12:342-52

Showing the most recent 10 out of 16 publications