We investigate and develop statistical methods for analysis and prediction of protein structure from sequence information. Lately, we have sought to incorporate multiple alignments of homologous sequences into our prediction methods. Such alignments can be utilized in many ways, and have yielded improvements in secondary structure prediction of up to 10%. We are seeking to explain these improvements in terms of sequence profiles, residue mutability, and the propensity to permit insertions and deletions in the sequence. Although it is possible to determine mutation-permissive locations in such alignments, it appears that the major improvements in prediction accuracy come from determining the residue profile and the gap propensity, and not from the apparent mutability of particular residues. We have developed a novel condensed representation for protein secondary structure which permits summarization of the entire PDB (Protein Data Bank) on only a few sheets of paper. This has facilitated our attempts to organize and classify the PDB so that improved class prediction may result in better structure prediction. We have attempted to predict reverse-turn classification from sequence, as this is hypothesized to be an important determinant of protein secondary structure. Using molecular-dynamics simulations, we have sought energy-minimal structures for short residue fragments, and are in the process of correlating them with the relative frequency of occurrence of such turns in the PDB. Our program MUSEQAL, for multiple alignment of protein sequences, was implemented on a parallel architecture using "speculative computation" techniques. The properties of this new program have been extensively investigated, and reveal that, in typical situations, as few as 70% of residue pairs may be aligned the same way by two different alignment programs, although both produce acceptable alignments. This finding highlights the unreliability of parts of many alignments.
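The residue-pair comparison mentioned above can be illustrated with a minimal sketch. It assumes that two alignments of the same sequences are given as lists of equal-length gapped strings with "-" as the gap character, and that agreement is measured as the fraction of residue pairs in a reference alignment that are also paired in the other. The function names `aligned_pairs` and `pair_agreement` are hypothetical, and this is not necessarily the measure used to evaluate MUSEQAL.

```python
from itertools import combinations


def aligned_pairs(alignment, gap="-"):
    """Return the set of residue pairs that share a column.

    Each residue is identified as (sequence index, position in the
    ungapped sequence), so pairs are comparable across alignments
    of the same sequences.
    """
    if not alignment:
        return set()
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all rows of an alignment must have equal length")

    next_pos = [0] * len(alignment)  # next ungapped position per sequence
    pairs = set()
    for col in range(len(alignment[0])):
        present = []
        for seq_idx, row in enumerate(alignment):
            if row[col] != gap:
                present.append((seq_idx, next_pos[seq_idx]))
                next_pos[seq_idx] += 1
        # every pair of residues in this column counts as aligned
        pairs.update(combinations(present, 2))
    return pairs


def pair_agreement(reference, other, gap="-"):
    """Fraction of residue pairs in `reference` that `other` also aligns."""
    ref_pairs = aligned_pairs(reference, gap)
    if not ref_pairs:
        return 0.0
    return len(ref_pairs & aligned_pairs(other, gap)) / len(ref_pairs)


# Illustrative toy alignments of the same two sequences (made-up data).
alignment_a = ["ACDE", "A-DE"]
alignment_b = ["ACDE", "AD-E"]
print(pair_agreement(alignment_a, alignment_b))
```

In this toy case the two alignments share two of the three residue pairs found in the first alignment. The measure as written is asymmetric, since it is normalized by the reference alignment; a symmetric variant could divide by the size of the union of the two pair sets instead.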
A new program, KINFIT II, was fully developed and tested, providing a new tool for ligand-receptor kinetic analysis. Advice and consultation were given to several groups at NIH on using this program and the LIGAND and ALLFIT programs. Several hundred copies of these programs and their documentation have been distributed so far.