Tropsha's group has made excellent progress on a novel approach to protein folding that applies methods of statistical geometry to the analysis and prediction of protein structure from primary sequence. In the past year, this work has led to new algorithms for sequence-structure compatibility (fold recognition) searches in multi-dimensional sequence-structure space. Individual amino acid residues in protein structures are represented in this work by their Cα atoms; thus each protein is described as a collection of points in three-dimensional space. Delaunay tessellation of this set of points generates an aggregate of space-filling, irregular tetrahedra, or Delaunay simplices. Statistical analysis of the residue compositions of all Delaunay simplices in a representative dataset of protein structures has produced a four-body residue contact potential (expressed as a log-likelihood factor q) for the full set of 20 amino acids as well as for several reduced sets. Two independent sequence-structure compatibility (threading) functions have been defined in terms of these q factors: 1) the sum of q factors for all Delaunay simplices in a given protein, and 2) Delaunay tessellation profiles, in which the sum of q factors for all simplices that share a given vertex residue is plotted as a function of residue number. Both threading functions were used as criteria to answer two questions: "does structure recognize sequence?" and "does sequence recognize structure?" We find that threading functions based on either a profile or a total score can distinguish the native fold from incorrect folds for a given sequence, and the native sequence from non-native sequences for a given fold. Figure 1. Comparison of 3D-1D Delaunay tessellation profiles for native and deliberately misfolded structures.
Of the two three-letter abbreviations in each legend, the first corresponds to the PDB code for the sequence and the second corresponds to the PDB code for the structure. In each case, the "self" matching produces a profile with higher scores, indicating that the interiors of these molecules are "packed" according to "rules" that are also followed by other proteins. These protocols deal effectively with nongapped threading and find an immediate application in selecting the most plausible conformation among alternative structures predicted by homology model building or molecular simulations. The figure illustrates how well Delaunay tessellation profiles discriminate between native and deliberately misfolded structures, the latter obtained by assigning a native sequence to another protein structure with the same sequence length. membranes. (M. Berkowitz, U. Essmann).
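
The q factors and the two threading functions described above can be illustrated with a minimal Python sketch. It is not the group's implementation. It assumes that the tessellation has already been computed and is supplied as a list of 4-tuples of residue indices (such a list could come from a library routine such as scipy.spatial.Delaunay applied to the Cα coordinates). It further assumes that q is the natural log of the observed frequency of a residue quadruplet divided by the frequency expected by chance, and that compositions absent from the training set score zero. The function names (composition, q_factors, total_score, tessellation_profile) and the toy data are hypothetical.

```python
import math
from collections import Counter


def composition(simplex, sequence):
    """Order-independent residue composition of one simplex, e.g. ('A','A','G','L')."""
    return tuple(sorted(sequence[i] for i in simplex))


def q_factors(training_set):
    """Estimate q for every composition observed in the training set.

    training_set: iterable of (sequence, simplices) pairs, where simplices is a
    list of 4-tuples of residue indices (the vertices of each tetrahedron).
    ASSUMPTION: q = ln(observed frequency / frequency expected by chance).
    """
    quad_counts = Counter()
    residue_counts = Counter()
    for sequence, simplices in training_set:
        residue_counts.update(sequence)
        for simplex in simplices:
            quad_counts[composition(simplex, sequence)] += 1
    n_quads = sum(quad_counts.values())
    n_residues = sum(residue_counts.values())

    q = {}
    for comp, count in quad_counts.items():
        observed = count / n_quads
        # Chance expectation: product of the residue frequencies times the
        # number of distinct orderings of the composition, 4! / prod(m!).
        orderings = math.factorial(4)
        for multiplicity in Counter(comp).values():
            orderings //= math.factorial(multiplicity)
        expected = float(orderings)
        for residue in comp:
            expected *= residue_counts[residue] / n_residues
        q[comp] = math.log(observed / expected)
    return q


def total_score(sequence, simplices, q, default=0.0):
    """Threading function 1: sum of q over all simplices of the structure."""
    return sum(q.get(composition(s, sequence), default) for s in simplices)


def tessellation_profile(sequence, simplices, q, default=0.0):
    """Threading function 2: for each residue, the sum of q over all simplices
    that have that residue as a vertex (ASSUMPTION: unseen compositions add
    `default`)."""
    scores = [0.0] * len(sequence)
    for s in simplices:
        value = q.get(composition(s, sequence), default)
        for i in s:
            scores[i] += value
    return scores


# Toy data, made up for illustration only (not taken from any protein).
toy_simplices = [(0, 1, 2, 3), (1, 2, 3, 4)]
toy_q = q_factors([("AAAAB", toy_simplices)])
native_total = total_score("AAAAB", toy_simplices, toy_q)
native_profile = tessellation_profile("AAAAB", toy_simplices, toy_q)
```

In this sketch, nongapped threading of a different sequence of the same length onto the same structure amounts to calling total_score or tessellation_profile with that sequence and the unchanged list of simplices, then comparing the result with the native scores.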

Agency
National Institutes of Health (NIH)
Institute
National Center for Research Resources (NCRR)
Type
Biotechnology Resource Grants (P41)
Project #
5P41RR008102-04
Application #
5225680
Study Section
Project Start
Project End
Budget Start
Budget End
Support Year
4
Fiscal Year
1996
Total Cost
Indirect Cost