We have updated our structurally unique data set of two-chain interfaces from the PDB (http://protein3d.ncifcrf.gov/keskino/). We clustered the interfaces based on their spatial structural similarities, regardless of the connectivity of their residues on the protein chains. The data set has grown several-fold since the version derived in 2004, as additional complexes from the PDB have been included and classified. This substantially more diverse data set reflects both the growth in the number of structures and, in particular, based on our statistics, the larger number of higher-molecular-weight proteins currently in the PDB. A comparison of the old and new data sets indicates that the number of newly found interface clusters has increased much more rapidly than the number of newly available PDB structures. This may suggest that the number of unique interfaces has not yet reached its upper limit. We divided the clusters into three types. Type I clusters consist of similar interfaces whose parent chains are also similar. In Type II clusters, the interfaces are similar; however, the overall structures of the parent proteins from which the interfaces derive are different. In all Type II cases that we have studied, the clustered proteins belong to different SCOP families, with different functions. Type III clusters contain interfaces where only one side of the interface is similar and the other side differs. Type III clusters illustrate that a binding site can interact with more than one chain, with different geometries, sizes, and composition. One of the paradigms in protein science states that similar global structures may have similar functions. Our observations suggest an extension of this paradigm: similar interface architectures may have different functions. As with protein structures, evolution has reused "good", favorable interface structural scaffolds and adapted them to diverse functions.
The functions extend from enzymes/inhibitors to toxins and immunoglobulins. We did not observe homodimers in Type II clusters. This is probably due to the smaller sizes of the monomers and the extensive interfaces in the two-state homodimers, which cover large portions of the chains. Our observation that globally different protein structures associate in similar ways to yield similar motifs is interesting. Clearly, there is a very large number of ways in which monomers can combinatorially assemble. Remarkably, among these there are preferred interface architectures, and these are similar to those observed in monomers. This observation both underscores the view that the number of favorable motifs is limited in nature and highlights the analogy between binding and folding. These interface architectures have now been included in a routine to predict new interfaces, their modes of association, and consequently protein function. We have shown that hot spots occur predominantly at the interfaces of macromolecular complexes, distinguishing binding sites from the remainder of the surface. Consequently, hot spots can be used to define binding epitopes. We have further shown a correspondence between energy hot spots and structurally conserved residues, and proposed that conserved residues at the binding interfaces confer rigidity to minimize the entropic cost of binding, whereas surrounding residues form a flexible cushion. Furthermore, our finding that similar residue hot spots occur across different protein families suggests that affinity and specificity are not necessarily coupled: higher affinity does not directly imply greater specificity. Conservation of Trp on the protein surface indicates a highly likely binding site. To a lesser extent, conservation of Phe and Met also implies a binding site. For all three residues, there is significant conservation in binding sites, whereas there is no conservation on the rest of the exposed surface.
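The three cluster types described earlier can be read as a simple decision rule. The sketch below is only an illustration of that rule, not the clustering procedure itself: it assumes the similarity judgments (for each side of the interface and for the parent chains) have already been made by some structural comparison, and the names `ClusterType` and `classify_cluster` are hypothetical.

```python
from enum import Enum
from typing import Optional


class ClusterType(Enum):
    TYPE_I = "I"      # similar interfaces, similar parent chains
    TYPE_II = "II"    # similar interfaces, different parent structures
    TYPE_III = "III"  # only one side of the interface is similar


def classify_cluster(side_a_similar: bool,
                     side_b_similar: bool,
                     parents_similar: bool) -> Optional[ClusterType]:
    """Map precomputed similarity flags to a cluster type.

    The flags are assumed inputs; how similarity is measured is
    outside the scope of this sketch.
    """
    if side_a_similar and side_b_similar:
        # Whole interface matches: parent similarity separates Type I from Type II.
        return ClusterType.TYPE_I if parents_similar else ClusterType.TYPE_II
    if side_a_similar or side_b_similar:
        # Exactly one side matches.
        return ClusterType.TYPE_III
    # Neither side matches: the interfaces would not share a cluster.
    return None
```

For example, `classify_cluster(True, True, False)` returns `ClusterType.TYPE_II`, the case in which similar interfaces derive from globally different parent proteins.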
Using the data set of protein-protein interfaces, we are now developing a scheme to predict protein function directly from protein structures by mapping known interfaces onto the surfaces of these proteins. In addition, efforts are continuing in the prediction of the tetramer structure of p53 and its functional dynamics when bound to DNA. We have recently rationalized why p53 recognition elements strongly prefer to occur without spacers (or with a spacer of one or two nucleotides) in the human genome, and shown that this preference correlates with p53 dimer-dimer and p53-DNA cooperativity. This should assist in devising algorithms for the prediction of p53 recognition elements in the human genome. Experimentally, p53 recognition elements are overwhelmingly found without spacers.
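As a rough illustration of how spacer length could enter a recognition-element search, the sketch below scans a DNA sequence for pairs of half-sites and reports the spacer between them. It is not the prediction algorithm referred to above. It assumes the commonly cited p53 consensus half-site RRRCWWGYYY (R = A/G, W = A/T, Y = C/T) and a default maximum spacer of 13 nucleotides, neither of which is stated in this summary; the function name `find_response_elements` is hypothetical, and only exact consensus matches on one strand are considered.

```python
import re
from typing import List, Tuple

# Assumed consensus half-site RRRCWWGYYY, written as a lookahead so that
# overlapping candidate half-sites are all reported.
HALF_SITE = re.compile(r"(?=([AG]{3}C[AT]{2}G[CT]{3}))")
HALF_SITE_LEN = 10


def find_response_elements(seq: str, max_spacer: int = 13) -> List[Tuple[int, int]]:
    """Return (start_of_first_half_site, spacer_length) for each half-site pair."""
    seq = seq.upper()
    starts = [m.start() for m in HALF_SITE.finditer(seq)]
    hits = []
    for i, first in enumerate(starts):
        for second in starts[i + 1:]:
            # Nucleotides between the end of the first half-site
            # and the start of the second.
            spacer = second - (first + HALF_SITE_LEN)
            if 0 <= spacer <= max_spacer:
                hits.append((first, spacer))
    return hits
```

Calling `find_response_elements(seq, max_spacer=2)` restricts the search to the spacer lengths of zero to two nucleotides highlighted above, and a scoring scheme could weight spacer-free hits most heavily.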