Determining the three-dimensional structure of the protein encoded by a primary peptide sequence is considered one of the most important problems in biophysics, with implications in many areas of molecular biology. The search for algorithms to accomplish this task has generally involved three steps: 1) determining a model for the folding of a peptide chain in three-dimensional space; 2) determining a correct Hamiltonian to describe the energy of interactions between the residues on the peptide chain as it is folded in three-dimensional space; 3) constructing an algorithm that folds the peptide chain into the minimal-energy configuration at reasonable computational cost. Perhaps the simplest Hamiltonian is the so-called random energy model (REM), which assumes that the contacts between residues are random and that the contact energy follows a Gaussian distribution. We consider a more complicated model in which the Hamiltonian is an energy matrix that specifies the interactions between residues, and in which contacts are allowed to have the dependencies that naturally arise because some of them involve the same residue. We have been able to show that the energy spectrum is still Gaussian in the limit of realistic numbers of contacts, as seen in proteins. Other properties are established as well, and this work has been published. In a related development, we have given a statistical method for computing the probability of a unique native state when the energy distribution from which the energies of the compact conformations are sampled is known. This method may be applied when the distribution is not Gaussian and when the number of contacts is variable. A paper is in press. Further work is planned using the same methods, which should allow us to answer some questions about the effect of inaccuracies in the Hamiltonian matrix elements.
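The claim that an energy-matrix Hamiltonian still yields an approximately Gaussian energy spectrum can be illustrated numerically. The following is a minimal sketch and not the published analysis: it assumes a hypothetical 20 x 20 symmetric energy matrix with standard normal entries, a fixed random sequence, and contact sets drawn at random from all residue pairs at least three positions apart (such sets are not guaranteed to be realizable by a physical chain). All function names and parameter values are illustrative assumptions.

```python
import random

AMINO_ACIDS = 20  # number of residue types (assumption for this sketch)


def random_energy_matrix(rng, n_types=AMINO_ACIDS):
    """Hypothetical symmetric contact-energy matrix with N(0, 1) entries."""
    m = [[0.0] * n_types for _ in range(n_types)]
    for a in range(n_types):
        for b in range(a, n_types):
            m[a][b] = m[b][a] = rng.gauss(0.0, 1.0)
    return m


def conformation_energy(sequence, contacts, matrix):
    """Sum the matrix energy over every residue pair (i, j) in contact."""
    return sum(matrix[sequence[i]][sequence[j]] for i, j in contacts)


def sample_energies(n_samples, length=60, n_contacts=80, seed=0):
    """Energies of random contact sets for one fixed sequence and matrix.

    Contacts may share residues, so their energies are not independent.
    """
    rng = random.Random(seed)
    matrix = random_energy_matrix(rng)
    sequence = [rng.randrange(AMINO_ACIDS) for _ in range(length)]
    # Candidate contacts: pairs at least 3 apart along the chain.
    pairs = [(i, j) for i in range(length) for j in range(i + 3, length)]
    return [
        conformation_energy(sequence, rng.sample(pairs, n_contacts), matrix)
        for _ in range(n_samples)
    ]


def skewness_and_excess_kurtosis(xs):
    """Both quantities are 0 for an exact Gaussian distribution."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / n
    skew = sum((x - mean) ** 3 for x in xs) / n / var ** 1.5
    kurt = sum((x - mean) ** 4 for x in xs) / n / var ** 2 - 3.0
    return skew, kurt


if __name__ == "__main__":
    skew, kurt = skewness_and_excess_kurtosis(sample_energies(2000))
    print(f"skewness={skew:.3f}, excess kurtosis={kurt:.3f}")
```

Skewness and excess kurtosis near zero are consistent with a Gaussian bulk, but these moments say little about the extreme low-energy tail.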
One question that comes up in this work is the form of the tails of the energy distribution for a molecule when the individual interaction energies are dictated by an energy matrix. As already noted, the distribution should be approximately Gaussian. What can be said about the extreme tails, where folding is expected to occur? We have developed a method based on random walks with barriers that gives an accurate picture of the extreme tails of the energy density for model molecules with energies determined by an energy matrix. The energy density is found to be approximately Gaussian out to roughly the native energy, but it then deviates markedly from the Gaussian as one moves to lower energies. We have used the method to study the amount of information in different structural features of proteins, such as secondary structure, tertiary contacts, and solvent accessibility. A paper has been prepared describing these results. We are also applying the random walk method to develop what we call a model energy distribution, from which the energies of the compact states of a model peptide chain on a cubic lattice should be sampled. We then apply our statistical prediction methods to estimate the probability that a particular energy generated by the random walk corresponds to a native (dominant Boltzmann probability) state. The sequence and structure pairs that correspond to native states are then used to test the recovery of the energy matrix from the contact data.
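The notion of a native state as one with dominant Boltzmann probability can be made concrete with a small calculation. The sketch below is a brute-force Monte Carlo illustration, not the statistical method described above: it draws the energies of a set of compact states from a supplied distribution and counts how often the lowest-energy state carries more than a chosen share of the Boltzmann weight. The 0.5 threshold, the function names, and the Gaussian example distribution are assumptions made for illustration only.

```python
import math
import random


def ground_state_probability(energies, kT=1.0):
    """Boltzmann probability of a single lowest-energy state.

    Energies are shifted by the minimum before exponentiating so that
    large negative energies do not overflow.
    """
    e_min = min(energies)
    partition = sum(math.exp(-(e - e_min) / kT) for e in energies)
    return 1.0 / partition


def fraction_with_native_state(draw_energy, n_states, n_trials,
                               kT=1.0, threshold=0.5, seed=0):
    """Fraction of trials in which the ground state is 'dominant'.

    draw_energy(rng) returns one energy from the assumed distribution.
    'Dominant' here means Boltzmann probability above `threshold`
    (an assumption of this sketch).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        energies = [draw_energy(rng) for _ in range(n_states)]
        if ground_state_probability(energies, kT) > threshold:
            hits += 1
    return hits / n_trials


def gaussian_energy(rng):
    """Example energy distribution: Gaussian, as in the random energy model."""
    return rng.gauss(0.0, 1.0)


if __name__ == "__main__":
    for kT in (0.2, 0.5, 1.0):
        frac = fraction_with_native_state(gaussian_energy, 500, 200, kT=kT)
        print(f"kT={kT}: fraction with dominant ground state = {frac:.2f}")
```

Because `draw_energy` is passed in as a function, a non-Gaussian distribution (for example, one with a modified low-energy tail) can be substituted without changing the rest of the code.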

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Intramural Research (Z01)
Project #
1Z01LM000065-02
Application #
2578638
Study Section
Special Emphasis Panel (CBB)
Support Year
2
Fiscal Year
1996
Name
National Library of Medicine
Country
United States