The current proposal is in response to PAR-09-218 (Innovations in Biomedical Computational Science and Technology) and is designed to develop new computational methods for predicting conserved RNA secondary structure for multiple homologs. The objective of the new computational methods is to improve automated RNA secondary structure prediction to match the performance of manual comparative analysis, which, although highly accurate, requires specialized skill and intensive manual effort. We will then apply these methods to determine conserved structures in HIV, SIV, and related immunodeficiency viruses and biochemically test hypotheses about structures required for viral replication. Computational tools for accurate predictions of RNA secondary structure have widespread applications in biology and medicine, contributing to better understanding of RNA function, discovery of new RNA genes, and methods for targeted drug design.
The specific aims of the research proposal are to: 1) Automate comparative sequence analysis, 2) Improve the models that define structure alignment by adding flexibility, and 3) Build structural alignments of HIV, SIV, and related immunodeficiency viruses and, via biochemical experiments, test the role of putative structures in HIV replication. To achieve Aim 1, we propose an innovative iterative computational framework that computes probabilistic representations of RNA folding and inter-sequence alignment and then iterates, updating alignments using folding information and vice versa. The novel aspect of this framework is that the complexity of the alignment and folding tasks in each iteration remains manageable, as though these tasks were performed independently, whereas the accuracy is significantly improved, as though folding and alignment were performed jointly.
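The iteration described for Aim 1 can be illustrated with a toy sketch. It covers only one direction of the exchange (updating folding estimates from alignment information, with the alignment held fixed), and every name and the linear blending rule are assumptions made for illustration; they are not the update rule of the proposed framework.

```python
def extrinsic_info(pair_probs, align_probs, s):
    """Project the other sequences' base-pairing probabilities onto sequence s.

    pair_probs[t][k][l]     : probability that positions k and l of sequence t pair.
    align_probs[(s, t)][i][k]: probability that position i of s aligns to k of t.
    Both inputs are hypothetical toy data structures.
    """
    n = len(pair_probs[s])
    others = [t for t in range(len(pair_probs)) if t != s]
    ext = [[0.0] * n for _ in range(n)]
    for t in others:
        a = align_probs[(s, t)]
        p = pair_probs[t]
        m = len(p)
        for i in range(n):
            for j in range(n):
                ext[i][j] += sum(
                    a[i][k] * p[k][l] * a[j][l]
                    for k in range(m)
                    for l in range(m)
                )
    # Average over the other sequences so the result stays a probability-like score.
    return [[v / len(others) for v in row] for row in ext]


def refine(intrinsic, align_probs, iterations=3, weight=0.5):
    """Iteratively blend each sequence's own (intrinsic) pairing probabilities
    with the information projected from its homologs.

    The linear blend is an assumed stand-in for a real folding update.
    """
    current = [[row[:] for row in p] for p in intrinsic]
    for _ in range(iterations):
        updated = []
        for s in range(len(current)):
            ext = extrinsic_info(current, align_probs, s)
            n = len(current[s])
            updated.append([
                [(1 - weight) * intrinsic[s][i][j] + weight * ext[i][j]
                 for j in range(n)]
                for i in range(n)
            ])
        current = updated
    return current
```

Each pass costs no more than a per-sequence update plus a pairwise projection, which is the sense in which the per-iteration work stays close to that of independent folding and alignment. A pair supported by the homolog gains probability over the iterations, while a pair the homolog lacks loses probability.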
Aim 2 addresses a common shortcoming of computational models for RNA folding and alignment. These models define common secondary structure rigidly and therefore fail to capture the variation in structural homology seen in natural RNAs, where insertions or deletions of entire domains are common. Our proposal addresses this limitation, within the computational framework of Aim 1, by improving probabilistic alignment models to better accommodate domain insertions and by introducing scoring modifications that allow insertions and deletions of entire domains with a modest penalty instead of forbidding them outright. To accomplish Aim 3, we will deploy our computational algorithms to analyze the RNA genomes of HIV and SIV, formulate hypotheses for the roles of structures in RNA replication based on common features in the predicted structures, and then test these hypotheses via biochemical experiments.
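The scoring change described for Aim 2 can be pictured with a small toy penalty function over a dot-bracket structure. The function names, the penalty values, and the rule that any other span touching paired nucleotides is forbidden are all assumptions chosen for illustration, not the scoring scheme of the proposal.

```python
import math


def is_closed_domain(structure, start, end):
    """Return True if structure[start..end] is a single helix-enclosed domain,
    i.e. the nucleotide at `start` is paired with the one at `end`."""
    if structure[start] != "(" or structure[end] != ")":
        return False
    depth = 0
    for pos in range(start, end + 1):
        if structure[pos] == "(":
            depth += 1
        elif structure[pos] == ")":
            depth -= 1
        if depth == 0:
            # The pair opened at `start` closes here; it must close exactly at `end`.
            return pos == end
    return False


def indel_penalty(structure, start, end,
                  per_nt=1.0, domain_penalty=3.0, flexible=True):
    """Toy penalty for inserting or deleting structure[start..end].

    per_nt and domain_penalty are hypothetical values.
    flexible=False mimics a rigid model that forbids domain indels.
    """
    span = structure[start:end + 1]
    if "(" not in span and ")" not in span:
        return per_nt * len(span)   # ordinary gap in an unpaired region
    if flexible and is_closed_domain(structure, start, end):
        return domain_penalty       # whole domain at a modest flat cost
    return math.inf                 # would break base pairs: forbidden here


structure = "..((...))..(((....)))."
print(indel_penalty(structure, 2, 8))                  # flexible model
print(indel_penalty(structure, 2, 8, flexible=False))  # rigid model
```

With the flexible setting, deleting the first hairpin as a unit costs the flat domain penalty, whereas the rigid setting rejects the same deletion.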

Public Health Relevance

This proposal has direct public health relevance. We are developing computational tools to predict and understand RNA structure. These tools can be applied to understanding the biology of infectious diseases because some viruses, including influenza and HIV, are RNA viruses. Furthermore, they can be used to design novel therapeutics, such as antisense oligonucleotides or small interfering RNAs, both of which target RNA and could be used to treat diseases such as cancer or inherited disorders. In this proposal, we are specifically applying the tools to the study of HIV, which could lead to the discovery of new replication mechanisms that could be targeted by therapeutics.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Preusch, Peter
University of Rochester
Engineering (All Types)
Schools of Engineering
United States
Piekna-Przybylska, Dorota; Sharma, Gaurav; Maggirwar, Sanjay B et al. (2017) Deficiency in DNA damage response, a new characteristic of cells infected with latent HIV-1. Cell Cycle 16:968-978
Tan, Zhen; Fu, Yinghan; Sharma, Gaurav et al. (2017) TurboFold II: RNA structural alignment and secondary structure prediction informed by multiple homologs. Nucleic Acids Res 45:11570-11581
Tan, Zhen; Sharma, Gaurav; Mathews, David H (2017) Modeling RNA Secondary Structure with Sequence Comparison and Experimental Mapping Data. Biophys J 113:330-338
Fu, Yinghan; Sharma, Gaurav; Mathews, David H (2014) Dynalign II: common secondary structure prediction for RNA homologs with domain insertions. Nucleic Acids Res 42:13939-48
Piekna-Przybylska, Dorota; Sullivan, Mark A; Sharma, Gaurav et al. (2014) U3 region in the HIV-1 genome adopts a G-quadruplex structure in its RNA and DNA sequence. Biochemistry 53:2581-93
Piekna-Przybylska, Dorota; Sharma, Gaurav; Bambara, Robert A (2013) Mechanism of HIV-1 RNA dimerization in the central region of the genome and significance for viral evolution. J Biol Chem 288:24140-50