Projects I and II address the need of structural genomics projects for improved methods of remote homology detection and improved alignment algorithms for structure-structure, sequence-structure, and sequence-sequence analysis. Collectively, these methods will contribute to: (i) target selection, identifying which sequences are most likely to yield a fold not seen before; (ii) functional annotation, through recognition of similarities to existing, correctly annotated sequences or structures; and (iii) provision of useful structural models for sequences for which no experimental structure is available. Alignments in Project I are based on statistical mechanical models. The alignment algorithm uses recursion relations derived from a partition function formulation of alignment probabilities. In the case of structure-structure alignments, the algorithm uses simple partition functions from polymer physics and essentially provides a physical theory of structural alignment. It is implemented within a dynamic programming format that closely resembles the "forward algorithm" commonly used in hidden Markov model alignment, and will be referred to as the SDP algorithm. The methods in Project II are based upon a Bayesian network model that combines elements of the three-dimensional profile approach to fold recognition with a hidden Markov model formalism. This allows the forward algorithm to be used to sum the probabilities of all possible sequence-to-structure model alignments, rather than relying on the single optimal or most probable sequence-structure alignment produced by the dynamic programming algorithms of the traditional 3-D profile alignment approach. The Bayesian network models incorporate both primary sequence and structural information useful in recognizing remote homologs for which structures already exist, and will provide alternative methods for the fold and superfamily classification of newly determined structures.
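The contrast drawn above, between summing over all alignments and keeping only the optimal one, can be illustrated with a minimal sketch. It is not the SDP algorithm or the Bayesian network model themselves: the function name `alignment_dp`, the match/mismatch/gap scores, and the `temperature` parameter are all assumptions chosen for illustration, and the scoring is a toy linear-gap scheme for sequence-sequence alignment. The same recursion computes a partition function Z (forward-style sum of Boltzmann weights) and the best score (max over the same moves).

```python
import math

def alignment_dp(a, b, match=1.0, mismatch=-1.0, gap=-1.0, temperature=1.0):
    """Return (log_Z, best_score) for global alignment of strings a and b.

    Z sums exp(score / temperature) over every alignment path (forward-style);
    best_score is the maximum score over the same paths (optimal alignment).
    All scoring parameters are illustrative assumptions, not values from the text.
    """
    n, m = len(a), len(b)
    gap_weight = math.exp(gap / temperature)
    Z = [[0.0] * (m + 1) for _ in range(n + 1)]
    V = [[-math.inf] * (m + 1) for _ in range(n + 1)]
    Z[0][0] = 1.0   # the empty alignment has weight 1
    V[0][0] = 0.0   # and score 0
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            z, v = 0.0, -math.inf
            if i > 0 and j > 0:  # align a[i-1] with b[j-1]
                s = match if a[i - 1] == b[j - 1] else mismatch
                z += Z[i - 1][j - 1] * math.exp(s / temperature)
                v = max(v, V[i - 1][j - 1] + s)
            if i > 0:            # a[i-1] aligned to a gap
                z += Z[i - 1][j] * gap_weight
                v = max(v, V[i - 1][j] + gap)
            if j > 0:            # b[j-1] aligned to a gap
                z += Z[i][j - 1] * gap_weight
                v = max(v, V[i][j - 1] + gap)
            Z[i][j], V[i][j] = z, v
    return math.log(Z[n][m]), V[n][m]

if __name__ == "__main__":
    log_z, best = alignment_dp("HEAGAWGHEE", "PAWHEAE")
    # Probability of the single best-scoring path among all paths (temperature = 1).
    print(log_z, best, math.exp(best - log_z))
```

Under these assumptions, `exp(best_score / temperature - log_Z)` is the share of the total weight carried by one best-scoring path; when that share is small, an optimal-alignment score discards most of the ensemble, which is the motivation for the summed formulation. For long sequences a practical version would work in log space to avoid overflow.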