Interactions between proteins and DNA molecules are central to a wide range of genetic and regulatory processes. This research project will lead to improved tools for computationally predicting protein-DNA structures and interactions using only protein sequence data (such as that generated by the human genome project). We will develop methods for predicting the three-dimensional structures of complexes between proteins and DNA molecules using the experimentally determined structures of related proteins. Our novel contribution will be the development of simulation techniques that can move the three-dimensional structure of the related protein in complex with DNA closer to the structure of the protein of interest bound to DNA. The lack of existing techniques that can achieve this task is widely recognized as a major impediment to the widespread transfer of structural information between proteins. Successful completion of this component of the project would greatly increase the impact of the ongoing structural genomics projects, which aim to determine experimentally the structures of a representative set of proteins, by allowing high-resolution structural data to be used to understand the structures and functions of uncharacterized proteins.

The goal of the second component of this project is to use the structural models we generate to make concrete predictions about the biological functions of the proteins in question. More precisely, we propose to develop methods that will predict which specific DNA sequences a given protein will bind. We will do this by building structural models of the protein in complex with a variety of DNA sequences and evaluating its affinity for these different sequences using advanced force fields. Success will hinge on the accuracy of the force fields we use to calculate the energies of interaction between the protein and its potential partners. As an additional complication, the structures of these DNA molecules may all be slightly different, and the protein itself may adapt structurally to different sequences in different ways to achieve an optimal fit.

In the final component of this project, we will apply these methods to study a specific protein of great biological importance. This protein, MyoD, has been called a 'master regulator' of skeletal muscle development for its remarkable ability to turn cells of a variety of types into muscle cells. We will investigate the mechanisms by which MyoD performs its critical functions during development by building structural models of the interactions between MyoD and specific sites in the genome. These models will include partner molecules that help to target MyoD to biologically relevant sites. The eventual goal of these studies will be to predict the sites in the genome at which MyoD and other key regulatory proteins exert their effects.

Public Health Relevance

This research will lead to an improved understanding of the fundamental regulatory processes by which our genes determine our physical characteristics, among them disease susceptibilities. The software tools developed in this project will be widely used to answer important questions about how proteins interact with DNA molecules in order to properly regulate our cellular processes.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM088277-03
Application #
8118972
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Preusch, Peter C
Project Start
2009-08-10
Project End
2014-07-31
Budget Start
2011-08-01
Budget End
2012-07-31
Support Year
3
Fiscal Year
2011
Total Cost
$326,021
Indirect Cost
Name
Fred Hutchinson Cancer Research Center
Department
Type
DUNS #
078200995
City
Seattle
State
WA
Country
United States
Zip Code
98109
Ilagan, Janine O; Ramakrishnan, Aravind; Hayes, Brian et al. (2015) U2AF1 mutations alter splice site recognition in hematological malignancies. Genome Res 25:14-26
O'Meara, Matthew J; Leaver-Fay, Andrew; Tyka, Michael D et al. (2015) Combined covalent-electrostatic model of hydrogen bonding improves structure prediction with Rosetta. J Chem Theory Comput 11:609-22
Joyce, Adam P; Zhang, Chi; Bradley, Philip et al. (2015) Structure-based modeling of protein:DNA specificity. Brief Funct Genomics 14:39-49
Thyme, Summer B; Song, Yifan; Brunette, T J et al. (2014) Massively parallel determination and modeling of endonuclease substrate specificity. Nucleic Acids Res 42:13839-52
Li, Shen; Bradley, Philip (2013) Probing the role of interfacial waters in protein-DNA recognition using a hybrid implicit/explicit solvation model. Proteins 81:1318-29
Mak, Amanda Nga-Sze; Bradley, Philip; Bogdanove, Adam J et al. (2013) TAL effectors: function, structure, engineering and applications. Curr Opin Struct Biol 23:93-9
Doyle, Erin L; Hummel, Aaron W; Demorest, Zachary L et al. (2013) TAL effector specificity for base 0 of the DNA target is altered in a complex, effector- and assay-dependent manner by substitutions for the tryptophan in cryptic repeat -1. PLoS One 8:e82120
Bradley, Philip (2012) Structural modeling of TAL effector-DNA interactions. Protein Sci 21:471-4
Christian, Michelle L; Demorest, Zachary L; Starker, Colby G et al. (2012) Targeting G with TAL effectors: a comparison of activities of TALENs constructed with NN and NK repeat variable di-residues. PLoS One 7:e45383
Yusuf, Dimas; Butland, Stefanie L; Swanson, Magdalena I et al. (2012) The transcription factor encyclopedia. Genome Biol 13:R24

Showing the most recent 10 out of 14 publications