The problem of determining the three-dimensional structure of a globular protein molecule from its amino acid sequence has gained new importance with the explosion in the number of known coding sequences. At the same time, the finding that there is a finite number of different three-dimensional structures, together with the accumulation of examples of a substantial number of these through protein crystallography, has opened the way to determining new structures by exploiting relationships with known ones. We are developing a computer algorithm that carries out the most difficult step in such modeling: building the parts of the new structure that are unrelated to any known one, that is, determining the conformation of short stretches of polypeptide chain given an approximate model for the structure of the rest of the molecule. The two primary goals of the present project are to extend the scope and reliability of the algorithm as far as possible, and to apply it to a range of modeling problems. Pursuing the first of these goals also involves analyzing a set of protein structures to further our understanding of what determines the observed conformations. The algorithm has two main stages: systematically generating a set of possible conformations of the peptide segment under consideration, and evaluating which of these conformations is correct by means of empirical energy functions or compatibility with experimental data. The success of the method depends on two key points: extracting from known protein structures a set of rules that restrict the number of conformations a chain may adopt, and using appropriate criteria to choose one conformation close to the correct structure. Useful rules include the preferred values of backbone and side-chain dihedral angles, the avoidance of short interatomic contacts, the optimization of electrostatic interactions, and the minimization of solvent-exposed hydrophobic area.
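The two-stage scheme described above (systematic generation under rules, then selection by a scoring criterion) can be illustrated with a deliberately simplified sketch. Everything in it is an assumption made for illustration and not the project's actual representation or energy function: residues are reduced to single beads on a cubic lattice, the rest of the molecule is a set of occupied lattice sites, the "short contact" rule is simple site exclusion, and the score counts the free lattice sites next to hydrophobic beads as a crude stand-in for solvent-exposed hydrophobic area. The names `generate`, `exposure`, and `rank` are hypothetical.

```python
# Toy model (assumption): each residue is one bead on a cubic lattice.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def add(a, b):
    """Component-wise sum of two lattice vectors."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def manhattan(a, b):
    """Minimum number of lattice steps between two sites."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1]) + abs(a[2] - b[2])


def generate(start, end, n_beads, environment):
    """Stage 1: enumerate every placement of n_beads segment beads that
    links the fixed anchors start and end without touching an occupied site."""
    if n_beads < 1:
        raise ValueError("segment must contain at least one bead")
    blocked = set(environment) | {start, end}
    results = []

    def extend(path):
        if len(path) == n_beads:
            # The closure rule below guarantees the last bead neighbours end.
            results.append(tuple(path))
            return
        last = path[-1] if path else start
        steps_left = n_beads - len(path)  # steps from the next bead to end
        for step in STEPS:
            nxt = add(last, step)
            if nxt in blocked or nxt in path:
                continue  # rule: no overlapping sites (short contacts)
            if manhattan(nxt, end) > steps_left:
                continue  # rule: chain could no longer reach the end anchor
            path.append(nxt)
            extend(path)
            path.pop()

    extend([])
    return results


def exposure(path, hydrophobic, environment, start, end):
    """Stage 2 score: free lattice sites adjacent to hydrophobic beads
    (lower is better)."""
    filled = set(environment) | {start, end} | set(path)
    total = 0
    for bead, is_hydrophobic in zip(path, hydrophobic):
        if is_hydrophobic:
            total += sum(1 for s in STEPS if add(bead, s) not in filled)
    return total


def rank(start, end, hydrophobic, environment):
    """Generate all candidates and sort them from best to worst score."""
    candidates = generate(start, end, len(hydrophobic), environment)
    return sorted(
        candidates,
        key=lambda p: exposure(p, hydrophobic, environment, start, end),
    )


if __name__ == "__main__":
    # Two hydrophobic beads bridging adjacent anchors, next to a small "wall".
    wall = {(0, 2, 0), (1, 2, 0)}
    ranked = rank((0, 0, 0), (1, 0, 0), [True, True], wall)
    print(len(ranked), "candidates; best:", ranked[0])
```

Pruning during generation, rather than filtering complete chains afterwards, is what keeps the enumeration tractable in this sketch: a partial chain that already violates a rule is never extended.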
In addition to its use in the determination of new protein structures, the algorithm is applicable in other areas of the study of protein structure and function, including understanding the conformational consequences of site-directed mutagenesis experiments, the design of segments of protein structure, and X-ray crystallographic refinement.