We have developed algorithms for comparison and alignment of protein three-dimensional structures. VAST (Vector Alignment Search Tool) identifies substructure similarities by comparing the types, connectivity, and relative orientations of SSEs (secondary structure elements). Surprising similarities are identified objectively, by considering the number and scores of superimposable SSE pairs in the best alignment and the number of alternative alignments sampled. Optimal residue-by-residue alignments are also identified objectively, as those with the most surprising combination of superposition residual and number of aligned residues.

Work has focused on three areas. The first is construction of an automated incremental update system, to maintain an all-against-all database of the "structure neighbor" relationships among domain structures in the public database. The VAST neighbor database now contains over 120 million structural superpositions and alignments.

The second area is construction of an "on-the-fly" structure neighbor server, which is now in use by structural biologists. This server allows them to transmit confidential coordinate data for VAST comparison against the public structure database, to identify possible remote homologs and to map features from one family member to another. Recently we implemented a number of major improvements to the structure neighbor server. The user interface for job submissions has been redesigned to make it easier and more intuitive to use. The back-end processing has also undergone major revisions, switching from a "file-based" system to database servers. The various programs have been ported to much faster machines, so that a typical job now runs about 5 times faster than before. The user now has a cleaner, easier-to-use interface and will obtain structure neighbor search results much more quickly.
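The SSE comparison described in the opening paragraph can be illustrated with a small sketch. It is not the VAST implementation: the `SSE` class, the function names, and the angle and distance tolerances are all hypothetical, chosen only to show how the type and relative orientation of a pair of secondary structure elements in one structure might be checked against a pair in another.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SSE:
    """A secondary structure element reduced to a typed axis vector (hypothetical model)."""
    kind: str     # "helix" or "strand"
    start: tuple  # (x, y, z) of the axis start, in angstroms
    end: tuple    # (x, y, z) of the axis end, in angstroms


def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def _midpoint(s):
    return tuple((p + q) / 2.0 for p, q in zip(s.start, s.end))


def pair_geometry(a, b):
    """Return (angle between axes in degrees, distance between axis midpoints)."""
    va, vb = _sub(a.end, a.start), _sub(b.end, b.start)
    cos_angle = _dot(va, vb) / (_norm(va) * _norm(vb))
    cos_angle = max(-1.0, min(1.0, cos_angle))  # guard against rounding error
    angle = math.degrees(math.acos(cos_angle))
    distance = _norm(_sub(_midpoint(a), _midpoint(b)))
    return angle, distance


def pairs_compatible(q1, q2, t1, t2, max_angle_diff=30.0, max_dist_diff=3.0):
    """True if the query pair (q1, q2) matches the target pair (t1, t2) in SSE
    type and, within the assumed tolerances, in relative orientation."""
    if (q1.kind, q2.kind) != (t1.kind, t2.kind):
        return False
    q_angle, q_dist = pair_geometry(q1, q2)
    t_angle, t_dist = pair_geometry(t1, t2)
    return (abs(q_angle - t_angle) <= max_angle_diff
            and abs(q_dist - t_dist) <= max_dist_diff)
```

Because only the angle between axes and the midpoint distance are compared, the check does not depend on how either structure is positioned in space. A full comparison would also have to account for connectivity and score the resulting alignments, which this sketch does not attempt.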
The third area of work has been a research project aimed at distinguishing structure neighbors that are related by descent from a common ancestral gene from those that are related by convergence to energetically preferred folding motifs. We have shown that homologs and "analogs", as they have been called, may be better distinguished by a test for the HCS (Homologous Core Structure), the substructure conserved among previously identified homologs, than by any measures proposed earlier. Research is in progress to fully automate HCS calculations, and to construct multiple structure alignments of homologous protein domains as clustered by HCS overlap. This research has led to a new graphical display system for structure neighbors that allows conservation of regions contributing to the HCS to be identified visually.

Another continuing project involves identification of conserved interaction or docking modes. Pairs of structural domains that interact with one another are compared, to determine whether the interacting domains are structurally similar and to measure how similar their relative docking orientations are. We intend to investigate whether interaction mode similarity scores can be used in clustering, to identify evolutionarily conserved interaction modes.
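The idea of testing a structure neighbor against the HCS can be sketched as a simple coverage calculation. This is an illustration under stated assumptions, not the measure used in the project: the function names, the representation of the HCS as a set of residue positions, and the 0.8 threshold are all hypothetical.

```python
def hcs_coverage(hcs_positions, aligned_positions):
    """Fraction of HCS residue positions covered by a neighbor's alignment.

    Both arguments are iterables of residue positions numbered on the same
    reference domain (an assumption of this sketch).
    """
    hcs = set(hcs_positions)
    if not hcs:
        raise ValueError("HCS must contain at least one position")
    return len(hcs & set(aligned_positions)) / len(hcs)


def covers_hcs(hcs_positions, aligned_positions, min_coverage=0.8):
    """True if the alignment covers at least min_coverage of the HCS.

    The threshold is a placeholder, not a published cutoff.
    """
    return hcs_coverage(hcs_positions, aligned_positions) >= min_coverage
```

For example, an HCS spanning positions 10 to 29 and an alignment spanning positions 15 to 39 share 15 of the 20 HCS positions, a coverage of 0.75, which falls below the placeholder threshold.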

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000057-13
Application #: 7316235
Study Section: (CBB)
Support Year: 13
Fiscal Year: 2006

Name: National Library of Medicine
Country: United States

Shoemaker, Benjamin A; Panchenko, Anna R; Bryant, Stephen H (2006) Finding biologically relevant protein domain interactions: conserved binding mode analysis. Protein Sci 15:352-61
Panchenko, Anna R; Wolf, Yuri I; Panchenko, Larisa A et al. (2005) Evolutionary plasticity of protein families: coupling between sequence and structure variation. Proteins 61:535-44
Panchenko, Anna R; Madej, Thomas (2004) Analysis of protein homology by assessing the (dis)similarity in protein loop regions. Proteins 57:539-47
Chen, Jie; Anderson, John B; DeWeese-Scott, Carol et al. (2003) MMDB: Entrez's 3D-structure database. Nucleic Acids Res 31:474-7
Wang, Yanli; Anderson, John B; Chen, Jie et al. (2002) MMDB: Entrez's 3D-structure database. Nucleic Acids Res 30:249-52
Marchler-Bauer, Aron; Panchenko, Anna R; Ariel, Naomi et al. (2002) Comparison of sequence and structure alignments for protein domains. Proteins 48:439-46
Wang, Y; Addess, K J; Geer, L et al. (2000) MMDB: 3D structure data in Entrez. Nucleic Acids Res 28:243-5