The investigators develop and implement parallel algorithms on the Connection Machine CM-5 to solve the protein folding problem faster and more accurately than is currently possible. The inverse folding approach tests the assumption that the sequence being analyzed adopts one of the already known protein folds. The new sequence is forced to adopt the structure of each known protein in turn, and its energy, or score, is calculated for every such structure. Structures that are not compatible with the sequence are dismissed. This approach is not a complete solution to the folding problem, since a sequence may adopt a novel, previously uncharacterized topology. Still, the number of examples in which nonhomologous proteins adopt very similar structures is large and growing. This, together with the extension of the inverse folding approach to supersecondary structure elements, provides a very general protein structure prediction method. The method is computationally very intensive. The investigators expect that the performance gained from appropriate parallel algorithms and architecture will allow them to relax some of the approximations made in the current approach, and thus increase its utility and predictive power.

The protein folding problem, i.e., predicting a protein's three-dimensional structure from its amino acid sequence, is a fundamental, unsolved problem in molecular biology. Knowledge of a protein's three-dimensional structure is necessary to understand its function and its interactions with other agents. Inverse folding is a more tractable approach to the general problem of protein structure prediction: it asks whether a newly sequenced protein can fold into one of the known protein topologies. This problem has many applications in drug design and protein structure prediction. The investigators use high-performance parallel computing technology to increase the performance and utility of this methodology. The algorithms and software packages resulting from this project will be available for commercialization and industrial applications.
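
The screening step described above (score the sequence against every known structure, then dismiss the incompatible ones) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the investigators' method: the two-letter environment alphabet (`B` for buried, `E` for exposed), the toy `residue_energy` function, the gapless alignment, and the `cutoff` parameter are all hypothetical stand-ins for a real energy function and fold library.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical simplification: residues are either hydrophobic or polar.
HYDROPHOBIC = set("AVLIMFWC")


def residue_energy(residue: str, environment: str) -> float:
    """Toy energy term (assumption): hydrophobic residues favour buried ('B')
    positions and polar residues favour exposed ('E') positions."""
    is_hydrophobic = residue in HYDROPHOBIC
    is_buried = environment == "B"
    return -1.0 if is_hydrophobic == is_buried else 1.0


def thread_energy(sequence: str, template: str) -> Optional[float]:
    """Force the sequence onto a template without gaps and return the lowest
    mean energy over all offsets, or None if the template is too short."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    n, m = len(sequence), len(template)
    if n > m:
        return None
    best: Optional[float] = None
    for offset in range(m - n + 1):
        total = sum(
            residue_energy(res, template[offset + i])
            for i, res in enumerate(sequence)
        )
        mean = total / n
        if best is None or mean < best:
            best = mean
    return best


def rank_folds(
    sequence: str, library: Dict[str, str], cutoff: float = 0.0
) -> List[Tuple[str, float]]:
    """Score the sequence against every template in the library, dismiss
    templates whose energy exceeds the cutoff, and sort the rest so the
    lowest-energy (most compatible) fold comes first."""
    kept: List[Tuple[str, float]] = []
    for name, template in library.items():
        energy = thread_energy(sequence, template)
        if energy is not None and energy <= cutoff:
            kept.append((name, energy))
    return sorted(kept, key=lambda item: item[1])


if __name__ == "__main__":
    # Hypothetical three-entry fold library using the B/E environment alphabet.
    library = {"fold_a": "BBEEBBEE", "fold_b": "EEEEEEEE", "fold_c": "BEB"}
    print(rank_folds("LLKKVVDD", library, cutoff=-0.5))
```

In this sketch each template is scored independently of the others, so the loop over the library is the kind of work that could in principle be divided among processors; how the investigators actually partition the computation on the CM-5 is not stated in the abstract.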

Agency: National Science Foundation (NSF)
Institute: Division of Mathematical Sciences (DMS)
Type: Standard Grant (Standard)
Application #: 9318771
Program Officer: Michael H. Steuerwalt
Project Start:
Project End:
Budget Start: 1994-09-01
Budget End: 1996-08-31
Support Year:
Fiscal Year: 1993
Total Cost: $47,200
Indirect Cost:

Name: Thinking Machines Corporation
Department:
Type:
DUNS #:
City: Cambridge
State: MA
Country: United States
Zip Code: 02142