Interactions between proteins and DNA molecules are central to a wide range of genetic and regulatory processes. This research project will lead to improved tools for computationally predicting protein-DNA structures and interactions using only protein sequence data (such as that generated by the Human Genome Project). We will develop methods for predicting the three-dimensional structures of protein-DNA complexes using the experimentally determined structures of related proteins. Our novel contribution will be the development of simulation techniques that can move the three-dimensional structure of the related protein in complex with DNA closer to the structure of the protein of interest bound to DNA. The lack of existing techniques that can achieve this task is widely recognized as a major impediment to the widespread transfer of structural information between proteins. Successful completion of this component of the project would greatly increase the impact of the ongoing structural genomics projects, which aim to determine experimentally the structures of a representative set of proteins, by allowing high-resolution structural data to be used to understand the structures and functions of uncharacterized proteins.
The goal of the second component of this project is to use the structural models we generate to make concrete predictions about the biological functions of the proteins in question. More precisely, we propose to develop methods that will predict which specific DNA sequences a given protein will bind. We will do this by building structural models of the protein in complex with a variety of DNA sequences and evaluating its affinity for these different sequences using advanced force fields. Success will hinge on the accuracy of the force fields we use to calculate the energies of interaction between the protein and its potential partners. As an additional complication, the structures of these DNA molecules may all be slightly different, and the protein itself may adapt structurally to different sequences in different ways to achieve an optimal fit.
In the final component of this project, we will apply these methods to study a specific protein of great biological importance. This protein, MyoD, has been called a 'master-regulator' of skeletal muscle development for its remarkable ability to turn cells of a variety of types into muscle cells. We will investigate the mechanisms by which MyoD performs its critical functions during development by building structural models of the interactions between MyoD and specific sites in the genome. These models will include partner molecules that help to target MyoD to biologically relevant sites. The eventual goal of these studies will be to predict the sites in the genome at which MyoD and other key regulatory proteins exert their effects.
This research will lead to an improved understanding of the fundamental regulatory processes by which our genes determine our physical characteristics, among them disease susceptibilities. The software tools developed in this project will be widely used to answer important questions about how proteins interact with DNA molecules in order to properly regulate our cellular processes.