RNA Secondary Structure Prediction and Analysis Software: We continue to improve upon an RNA folding algorithm (MPGAfold) that uses concepts from genetic algorithms, and we apply this algorithm to various biological problems (see below). An optimized version that was adapted to run on Linux clusters using MPI is available upon request. The algorithm is capable of predicting RNA pseudoknots and exploring folding pathways that contain multiple functional conformations. A Java-based visualizer for depicting population evolution has also been developed; when coupled with the MPI version of MPGAfold, it makes the system more user friendly and portable and allows for a detailed exploration of the structure population space. STRUCTURELAB, the heterogeneous bioinformatics RNA analysis workbench, which permits the use of a broad array of approaches for RNA structure analysis, has been continually enhanced. It has been used for the visualization of folding pathways in conjunction with the genetic algorithm and with dynamic programming algorithms (e.g., mFOLD). STRUCTURELAB and other new tools contain several features that, when used together, act as a set of data mining methodologies to aid in the discovery of RNA folding patterns. These systems have been adapted to other environments inside and outside our laboratory and NIH and are available for download from our newly enhanced Web site. KNetFold, our new algorithm for RNA secondary structure prediction, has also been enhanced. The methodology integrates thermodynamic and compensatory base change information using an innovative machine-learning algorithm (a hierarchical network of k-nearest neighbor classifiers). KNetFold has been shown to outperform other RNA secondary structure prediction programs. Another program, CorreLogo, has also been enhanced. It depicts, in a three-dimensional plot, correlations that exist between base pairs in a secondary structure.
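The correlations between base-paired positions that CorreLogo depicts are derived from mutual information over a sequence alignment. As a rough illustration only, the following Python sketch computes the mutual information between two alignment columns. The function name `mutual_information`, the treatment of gaps and the toy alignment are assumptions made for this example; it is not the CorreLogo or KNetFold implementation.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    Hypothetical helper for illustration: `alignment` is a list of
    equal-length sequences, and rows with a gap ('-') in either column
    are skipped (an assumption, not a documented convention).
    """
    pairs = [(seq[i], seq[j]) for seq in alignment
             if seq[i] != '-' and seq[j] != '-']
    n = len(pairs)
    if n == 0:
        return 0.0

    # Observed counts for each column and for the joint pair.
    count_i = Counter(a for a, _ in pairs)
    count_j = Counter(b for _, b in pairs)
    count_ij = Counter(pairs)

    mi = 0.0
    for (a, b), c in count_ij.items():
        p_ab = c / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Toy alignment (invented data): columns 0 and 4 covary as
# complementary pairs, while columns 1 and 2 are fully conserved.
alignment = [
    "GACUC",
    "CACUG",
    "AACUU",
    "UACUA",
]

print(mutual_information(alignment, 0, 4))  # 2.0 bits: perfect covariation
print(mutual_information(alignment, 1, 2))  # 0.0 bits: conserved columns
```

Fully conserved columns score zero because they carry no covariation signal, which is why compensatory base changes, rather than conservation alone, are informative for locating base pairs.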
These methodologies use mutual information derived from a sequence alignment. Both KNetFold and CorreLogo are available as Web servers on our newly configured Web site, from which they can also be downloaded. We developed a new Web server called RADAR that provides a range of functions for RNA data analysis. It can align structure-annotated RNA sequences so that both sequence and structure information are used during the alignment process. This server can perform pairwise structure alignment, multiple structure alignment, database search and clustering. RADAR provides two major features: constrained alignment of RNA secondary structures and prediction of the consensus structure for a set of RNA sequences. In addition, a new Web-accessible RNA secondary structure database, RmotifDB, was developed. RmotifDB is also integrated with a gene ontology database. Algorithms have been developed to determine regions within genomes that are indicative of non-coding RNAs. Structural characteristics that distinguish those regions that may occur in intergenic or control regions of RNA are being determined. These methods have been applied in various biological contexts.
Determination of Biologically Related RNA Secondary Structure Folding Characteristics: The computational tools described above have been employed in studying RNA structural characteristics, folding pathways and functional intermediates of various RNAs. These are exemplified by the analysis of the folding pathways of the HIV 5' and 3' non-coding regions and of the control mechanisms of the hepatitis delta virus, interleukin-2, rotavirus, dengue virus (and the flaviviruses in general) and the turnip crinkle virus. They are also providing insight into cancer development that is inducible by the upregulation of eIF4E or controlled by the presence or absence of PDCD4.
Algorithms that were developed for the determination of non-coding RNAs in genomes are being applied to the intergenic regions of E. coli and to Musashi binding sites, and are being used to determine potential structural RNA elements that are involved in RNA translation initiation. Each of these sites is proving to contain unique features and characteristics that are inherent to the different biological domains being examined.
Software for RNA 3D Structure Prediction: We developed a program, RNA2D3D, for interactively exploring RNA 3D conformations at the all-atom level. It works on the premise of generating three-dimensional models of an RNA from a given RNA secondary structure. The secondary structure may be generated by any one of several methodologies. The program generates structures very rapidly; for example, it can import a secondary structure description of a 1542-base 16S rRNA and generate a rough 3D model in less than a second. As part of the input secondary structure representation, the specification of pseudoknot interactions is also permitted. RNA2D3D also allows for the specification of coaxial stacking between stems and for compactification, which essentially extends an A-form helix into loops that can form non-canonical base pairs. This latter feature is quite helpful because loops are often not found as single strands. General molecular editing features such as rotating or translating conformational segments are also permitted. RNA motifs from the Protein Data Bank (PDB) can also be imported and attached to the modeled structure. The Tinker molecular dynamics software can be invoked, permitting the refinement of bond angles and distances.
Three-Dimensional RNA Structural Modeling and Analysis: We have predicted the structure of the wild-type telomerase pseudoknot and have done molecular dynamics studies on the RNA hairpin and pseudoknot that are important structural elements in telomerase.
These studies show that an unusual sequence of non-canonical base pairs has dynamic conformational characteristics that induce the formation of the pseudoknot. These results have significant implications concerning genetic diseases such as dyskeratosis congenita, aplastic anemia and cancer.
New Paradigm for Translational Enhancement Discovered: We applied our secondary structure prediction methodologies, 3-D modeling methodologies (including the molecular modeling software RNA2D3D) and molecular dynamics to discover a unique motif in the turnip crinkle virus 3' UTR. It is becoming evident that the 3' UTRs of cellular and viral mRNAs harbor elements that function in gene expression by enhancing translation using as-yet-unknown mechanisms. Some of these cellular mRNAs, including ornithine decarboxylase, encode products whose overproduction leads to cell proliferation. To determine the function of these elements, we employed a simple model virus, Turnip crinkle virus (TCV). TCV, like many plant viruses, is translated in a cap-independent fashion and contains a 3' proximal region that, together with the 5' UTR, synergistically enhances translation. We have gained a significant understanding of the function of this 3' element. We used MPGAfold to identify a series of hairpins and one pseudoknot that have been confirmed genetically. Using this RNA secondary structural information in conjunction with RNA2D3D for RNA 3D molecular modeling, we predicted that a series of three hairpins and two pseudoknots structurally resembled a tRNA. This has been experimentally verifi [summary truncated at 7800 characters]