The Human Genome Project and related efforts have generated enormous amounts of raw biological sequence data. However, understanding how biological sequences encode structural and functional information remains a fundamental scientific challenge. In particular, the information encoded in RNA viral genomes extends well beyond their protein-coding role to the role of intra-sequence base pairing in viral packaging, replication, and gene expression. Deciphering the different levels of information encoded in these sequences is therefore essential for a full understanding of structure-function relationships in RNA viruses.

Our goal is to understand how secondary structure information, expressed as the selective formation of base pairs, is encoded in large RNA viral genomes. Since current prediction methods cannot reliably and efficiently treat these lengthy sequences, we are developing novel combinatorial and computational approaches to the analysis, prediction, and design of viral RNA secondary structures. The outcomes of our research will be a discrete mathematical model of RNA folding and high-performance combinatorial algorithms for predicting secondary structures of large RNA molecules. The success of our methods for unenveloped icosahedral RNA viruses would extend to other large RNA molecules and have important implications for the prevention and treatment of numerous RNA-related diseases.

Our research addresses three specific aims.

(1) We will identify and evaluate characteristics of RNA secondary structures that differentiate base pairings encoding significant structural and functional information from those that are not well-determined. By refining our combinatorial model of RNA folding, we will distinguish configurations whose folding follows natural energy minima from base pairings that encode well-determined, and likely functionally significant, substructures.

(2) We will predict new structures by developing the mathematical framework and computational techniques needed to construct a low-energy RNA secondary structure from minimal free energy substructures. By exploiting parallel and multicore processors, our novel approach will predict important functional motifs in the secondary structures of large RNA molecules with a greater degree of accuracy.

(3) We will compare the compatibility of our predicted secondary structures with experimental information on RNA viruses using three-dimensional molecular modeling methods. These complementary approaches will be used iteratively to arrive at a final model and to design experimentally testable hypotheses.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Special Emphasis Panel (ZGM1-CBCB-5 (BM))
Program Officer
Lewis, Catherine D
Georgia Institute of Technology
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Rogers, Emily; Heitsch, Christine (2016) New insights from cluster analysis methods for RNA secondary structure prediction. Wiley Interdiscip Rev RNA 7:278-94
Rogers, Emily; Heitsch, Christine E (2014) Profiling small RNA reveals multimodal substructural signals in a Boltzmann ensemble. Nucleic Acids Res 42:e171
Poznanović, Svetlana; Heitsch, Christine E (2014) Asymptotic distribution of motifs in a stochastic context-free grammar model of RNA folding. J Math Biol 69:1743-72
Sukosd, Zsuzsanna; Swenson, M Shel; Kjems, Jorgen et al. (2013) Evaluating the accuracy of SHAPE-directed RNA secondary structure predictions. Nucleic Acids Res 41:2807-16
Harvey, Stephen C; Zeng, Yingying; Heitsch, Christine E (2013) The icosahedral RNA virus as a grotto: organizing the genome into stalagmites and stalactites. J Biol Phys 39:163-72
Zeng, Yingying; Larson, Steven B; Heitsch, Christine E et al. (2012) A model for the structure of satellite tobacco mosaic virus. J Struct Biol 180:110-6
Swenson, M Shel; Anderson, Joshua; Ash, Andrew et al. (2012) GTfold: enabling parallel RNA secondary structure prediction on multi-core desktops. BMC Res Notes 5:341
Hower, Valerie; Heitsch, Christine E (2011) Parametric analysis of RNA branching configurations. Bull Math Biol 73:754-76
Bakhtin, Yuri; Heitsch, Christine E (2009) Large deviations for random trees and the branching of RNA secondary structures. Bull Math Biol 71:84-106
Bader, David A; Aluru, Srinivas (2008) Guest Editors' Introduction, Special Issue on High-Performance Computational Biology. Parallel Comput 34:613-615

Showing the most recent 10 out of 11 publications