Interactions between proteins and DNA molecules are central to a wide range of genetic and regulatory processes. This research project will lead to improved tools for computationally predicting protein-DNA structures and interactions using only protein sequence data (such as that generated by the Human Genome Project). We will develop methods for predicting the three-dimensional structures of protein-DNA complexes from the experimentally determined structures of related proteins. Our novel contribution will be simulation techniques that move the three-dimensional structure of a related protein in complex with DNA closer to the structure of the protein of interest bound to DNA. The lack of techniques that can achieve this is widely recognized as a major impediment to the transfer of structural information between proteins. Successful completion of this component would greatly increase the impact of the ongoing structural genomics projects, which aim to determine experimentally the structures of a representative set of proteins, by allowing high-resolution structural data to be used to understand the structures and functions of uncharacterized proteins.

The goal of the second component is to use the structural models we generate to make concrete predictions about the biological functions of the proteins in question. More precisely, we propose to develop methods that predict which specific DNA sequences a given protein will bind. We will do this by building structural models of the protein in complex with a variety of DNA sequences and evaluating its affinity for each sequence using advanced force fields. Success will hinge on the accuracy of the force fields we use to calculate the energies of interaction between the protein and its potential partners. As an additional complication, the structures of these DNA molecules may all be slightly different, and the protein itself may adapt structurally to different sequences in different ways to achieve an optimal fit.

In the final component, we will apply these methods to study a specific protein of great biological importance. This protein, MyoD, has been called a 'master regulator' of skeletal muscle development for its remarkable ability to turn cells of a variety of types into muscle cells. We will investigate the mechanisms by which MyoD performs its critical functions during development by building structural models of the interactions between MyoD and specific sites in the genome. These models will include partner molecules that help to target MyoD to biologically relevant sites. The eventual goal of these studies is to predict the sites in the genome at which MyoD and other key regulatory proteins exert their effects.
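The specificity-prediction step described above (scoring one protein against a variety of DNA sequences and comparing the resulting energies) can be illustrated with a minimal sketch. It is not the project's method: it assumes each candidate sequence has already been assigned a single interaction energy by some force field, and it turns those energies into relative binding preferences with a simple Boltzmann weighting. The function names, the default `kT` value (roughly 0.593 kcal/mol near room temperature), and the example sequences and energies are all hypothetical placeholders.

```python
import math


def boltzmann_weights(energies, kT=0.593):
    """Convert per-sequence energies into relative binding probabilities.

    energies: dict mapping a DNA sequence to a computed interaction energy
              (lower means more favorable). Units are assumed to match kT.
    kT:       thermal energy; 0.593 assumes kcal/mol near 298 K (an assumption,
              since force-field scores are often in arbitrary units).
    """
    # Shift by the minimum energy so the exponentials stay numerically stable.
    e_min = min(energies.values())
    weights = {seq: math.exp(-(e - e_min) / kT) for seq, e in energies.items()}
    total = sum(weights.values())
    return {seq: w / total for seq, w in weights.items()}


def rank_sequences(energies, kT=0.593):
    """Return sequences ordered from most to least preferred."""
    probs = boltzmann_weights(energies, kT)
    return sorted(probs, key=probs.get, reverse=True)


if __name__ == "__main__":
    # Placeholder energies for illustration only; not measured or computed data.
    example = {"CAGCTG": -12.0, "CACGTG": -11.2, "CATATG": -9.5}
    for seq in rank_sequences(example):
        print(seq, round(boltzmann_weights(example)[seq], 3))
```

In practice the complication noted in the abstract still applies: each sequence may require its own relaxed model of the DNA and protein before an energy is computed, so the energies fed into a scheme like this are only as reliable as the underlying structural models and force field.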

Public Health Relevance

This research will lead to an improved understanding of the fundamental regulatory processes by which our genes determine our physical characteristics, among them disease susceptibilities. The software tools developed in this project will be widely used to answer important questions about how proteins interact with DNA molecules in order to properly regulate our cellular processes.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Preusch, Peter C
Fred Hutchinson Cancer Research Center
United States
Ilagan, Janine O; Ramakrishnan, Aravind; Hayes, Brian et al. (2015) U2AF1 mutations alter splice site recognition in hematological malignancies. Genome Res 25:14-26
O'Meara, Matthew J; Leaver-Fay, Andrew; Tyka, Michael D et al. (2015) Combined covalent-electrostatic model of hydrogen bonding improves structure prediction with Rosetta. J Chem Theory Comput 11:609-22
Joyce, Adam P; Zhang, Chi; Bradley, Philip et al. (2015) Structure-based modeling of protein:DNA specificity. Brief Funct Genomics 14:39-49
Thyme, Summer B; Song, Yifan; Brunette, T J et al. (2014) Massively parallel determination and modeling of endonuclease substrate specificity. Nucleic Acids Res 42:13839-52
Doyle, Erin L; Hummel, Aaron W; Demorest, Zachary L et al. (2013) TAL effector specificity for base 0 of the DNA target is altered in a complex, effector- and assay-dependent manner by substitutions for the tryptophan in cryptic repeat -1. PLoS One 8:e82120
Li, Shen; Bradley, Philip (2013) Probing the role of interfacial waters in protein-DNA recognition using a hybrid implicit/explicit solvation model. Proteins 81:1318-29
Mak, Amanda Nga-Sze; Bradley, Philip; Bogdanove, Adam J et al. (2013) TAL effectors: function, structure, engineering and applications. Curr Opin Struct Biol 23:93-9
Liu, Limin Angela; Bradley, Philip (2012) Atomistic modeling of protein-DNA interaction specificity: progress and applications. Curr Opin Struct Biol 22:397-405
Bradley, Philip (2012) Structural modeling of TAL effector-DNA interactions. Protein Sci 21:471-4
Christian, Michelle L; Demorest, Zachary L; Starker, Colby G et al. (2012) Targeting G with TAL effectors: a comparison of activities of TALENs constructed with NN and NK repeat variable di-residues. PLoS One 7:e45383
