We have developed algorithms for the comparison and alignment of three-dimensional protein structures. VAST (Vector Alignment Search Tool) identifies substructure similarities by comparing the types, connectivity, and relative orientations of secondary structure elements (SSEs). Surprising similarities are identified objectively, by considering the number and scores of superimposable SSE pairs in the best alignment and the number of alternative alignments sampled. An optimal residue-by-residue alignment is also identified objectively, as the one with the most surprising combination of superposition residual and number of aligned residues. Work this year has focused on three areas. The first is construction of an automated incremental update system, to maintain an all-against-all database of the "structure neighbor" relationships among domain structures in the public database. The VAST neighbor database now contains nearly 3 million structural superposition alignments. The second area is construction of an "on-the-fly" structure neighbor server, which is now in use by structural biologists. This server allows them to transmit confidential coordinate data for VAST comparison against the public structure database, to identify possible remote homologs and to map features from one family member to another. The third area of work this year has been a research project aimed at distinguishing structure neighbors that are related by descent from a common ancestral gene from those that are related by convergence to energetically preferred folding motifs. We have shown that homologs and "analogs", as they have been called, may be better distinguished by a test for the HCS (Homologous Core Structure), the substructure conserved among previously identified homologs, than by any measures proposed earlier.
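The idea of comparing SSE types and relative orientations can be illustrated with a minimal Python sketch. It is not the VAST implementation: the representation of an SSE as a `(type, start, end)` tuple, the function names, and the 30-degree tolerance are all assumptions made for illustration. The sketch only checks whether a pair of SSEs in one structure has the same types and a similar inter-axis angle as a pair in another structure. It does not model connectivity, scoring, or the statistical assessment of surprise.

```python
import math

def sse_vector(start, end):
    """Direction vector of an SSE from its (hypothetical) start and end points."""
    return tuple(e - s for s, e in zip(start, end))

def angle_deg(u, v):
    """Angle in degrees between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(a * a for a in v))
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def pairs_compatible(pair_a, pair_b, max_angle_diff=30.0):
    """Return True if two SSE pairs have matching types and similar orientation.

    Each pair is a tuple of two SSEs; each SSE is (type, start, end).
    The tolerance max_angle_diff is an illustrative value, not a VAST parameter.
    """
    (type_a1, start_a1, end_a1), (type_a2, start_a2, end_a2) = pair_a
    (type_b1, start_b1, end_b1), (type_b2, start_b2, end_b2) = pair_b
    if (type_a1, type_a2) != (type_b1, type_b2):
        return False
    angle_a = angle_deg(sse_vector(start_a1, end_a1), sse_vector(start_a2, end_a2))
    angle_b = angle_deg(sse_vector(start_b1, end_b1), sse_vector(start_b2, end_b2))
    return abs(angle_a - angle_b) <= max_angle_diff

# Example: a helix and a strand, roughly perpendicular in both structures.
query_pair = (("helix", (0, 0, 0), (10, 0, 0)), ("strand", (0, 5, 0), (0, 15, 0)))
target_pair = (("helix", (1, 1, 1), (11, 1, 1)), ("strand", (1, 6, 1), (1, 16, 2)))
print(pairs_compatible(query_pair, target_pair))
```

In a full comparison, a check of this kind would be applied to many SSE pairs, and the set of mutually consistent pairs would then be scored.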

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Intramural Research (Z01)
Project #
1Z01LM000057-05
Application #
6111066
Study Section
Special Emphasis Panel (CBB)
Project Start
Project End
Budget Start
Budget End
Support Year
5
Fiscal Year
1998
Total Cost
Indirect Cost
Name
National Library of Medicine
Department
Type
DUNS #
City
State
Country
United States
Zip Code
Shoemaker, Benjamin A; Panchenko, Anna R; Bryant, Stephen H (2006) Finding biologically relevant protein domain interactions: conserved binding mode analysis. Protein Sci 15:352-61
Panchenko, Anna R; Wolf, Yuri I; Panchenko, Larisa A et al. (2005) Evolutionary plasticity of protein families: coupling between sequence and structure variation. Proteins 61:535-44
Panchenko, Anna R; Madej, Thomas (2004) Analysis of protein homology by assessing the (dis)similarity in protein loop regions. Proteins 57:539-47
Chen, Jie; Anderson, John B; DeWeese-Scott, Carol et al. (2003) MMDB: Entrez's 3D-structure database. Nucleic Acids Res 31:474-7
Wang, Yanli; Anderson, John B; Chen, Jie et al. (2002) MMDB: Entrez's 3D-structure database. Nucleic Acids Res 30:249-52
Marchler-Bauer, Aron; Panchenko, Anna R; Ariel, Naomi et al. (2002) Comparison of sequence and structure alignments for protein domains. Proteins 48:439-46
Wang, Y; Addess, K J; Geer, L et al. (2000) MMDB: 3D structure data in Entrez. Nucleic Acids Res 28:243-5