Atomically detailed molecular modeling is a powerful tool for the prediction and design of protein interactions. In this project, we will develop molecular modeling techniques for simulating interactions between tandem repeat proteins and their binding partners, focusing on repeat proteins that recognize nucleic acids. Repeat proteins typically have highly symmetrical structures, with individual repeat units assuming similar conformations; in the case of repeat proteins that recognize a modular partner such as a nucleic acid or peptide, this symmetry may also extend to the conformation of the binding partner. Our modeling framework will incorporate the symmetry of repeat-protein interactions to constrain the space of conformations and protein sequences that must be explored. Tandem repeat proteins provide an architecture for protein interactions that has been used repeatedly throughout biological evolution to generate specific binding proteins. Armadillo, TPR, ankyrin, and HEAT repeat proteins have been selected for use in protein-protein interactions, while C2H2 zinc fingers, PUF, and PPR repeats are widely deployed for sequence-specific recognition of nucleic acids. Recently, a novel family of bacterial repeat proteins, the transcription activator-like (TAL) effectors, has been discovered that recognizes DNA in a remarkably modular fashion, with each repeat targeting a single base of the DNA binding site according to a simple recognition code. Using this recognition code, engineered TAL effectors can be efficiently targeted to novel DNA sites; this capability is transforming current approaches to genome engineering. To understand the molecular mechanisms underlying this recognition code, we performed molecular modeling simulations of TAL effector-DNA interactions that used structural symmetry and predicted protein-DNA contacts to reduce the space of possible bound conformations.
Using these simulations, we were able to generate accurate molecular models of TAL effector-DNA complexes; we subsequently used these models to solve, by molecular replacement, the first crystal structure of a naturally occurring TAL effector in complex with DNA. We propose to extend these simulations to allow prediction and design of a diverse range of tandem repeat protein:nucleic acid interactions, with two specific applications: (1) optimization of the TAL effector platform for modular DNA sequence recognition, and (2) prediction of the RNA binding mode of pentatricopeptide repeat (PPR) proteins, a widespread family of RNA-binding proteins that may represent a new and powerful ssRNA-targeting scaffold.
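The modular, one-repeat-per-base recognition described above can be illustrated with a minimal Python sketch. It is not part of the proposed modeling framework. The mapping below uses the commonly reported base preferences of four repeat-variable diresidues (RVDs), which this summary does not itself list, so the table should be read as an assumption; the names `RVD_TO_BASES`, `predicted_site`, and `matches` are hypothetical and introduced only for illustration.

```python
# Assumed, commonly reported RVD-to-base preferences; the summary above refers
# to "a simple recognition code" but does not spell it out.
RVD_TO_BASES = {
    "NI": "A",
    "HD": "C",
    "NG": "T",
    "NN": "GA",  # NN is reported to tolerate both G and A
}


def predicted_site(rvds):
    """Return, for each repeat in order, the set of DNA bases it is assumed to accept.

    Raises KeyError for an RVD that is not in the illustrative table.
    """
    return [set(RVD_TO_BASES[rvd]) for rvd in rvds]


def matches(rvds, dna):
    """Check whether a DNA site is compatible with a repeat array, one base per repeat."""
    site = predicted_site(rvds)
    if len(dna) != len(site):
        return False
    return all(base in allowed for base, allowed in zip(dna.upper(), site))


if __name__ == "__main__":
    repeats = ["NI", "HD", "NG", "NN"]
    for candidate in ("ACTG", "ACTA", "ACTC"):
        print(candidate, matches(repeats, candidate))
```

Because each repeat is scored independently of its neighbors, retargeting an array to a new site in this simplified picture only requires swapping individual entries in the repeat list. Real binding specificity also depends on context that a lookup table does not capture.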
This research will advance the prediction and design of protein:nucleic acid interactions mediated by tandem repeat proteins. Predictions of protein:DNA and protein:RNA interactions contribute to our understanding of biology and medicine, while design of novel protein:nucleic acid interactions can lead to new technologies for gene therapy and other disease treatments.