We have developed databases and software useful for comparative analysis of protein three-dimensional structure. These tools are distributed freely to biologists and developers of biotechnology software. MMDB (Molecular Modeling DataBase) is the 3D-structure component of the Entrez molecular biology retrieval system. MMDB is an ASN.1 database where all data items describing macromolecular structure are validated and explicitly listed, so that application software need not contain the complex logic required to retrieve this information from text formats such as PDB files. Work has concentrated on the addition of accurate taxonomy assignments for macromolecular structures within MMDB, the creation of new message and data types for transmission of structure-structure alignment data to local viewers, and the construction of an automated monthly update and indexing system, Pubstruct. CN3D ("see in three dimensions") is a multi-structure visualization program distributed as part of the Entrez client software and in a stand-alone version launchable via the MIME protocol in World-Wide-Web Entrez. The software differs from other public domain viewers in supporting display of multiple aligned structures from Entrez's "structure neighbor" database, and in supporting simultaneous highlighting/picking of multiple sequence and multiple structure alignments. Another feature added this year is on-the-fly alignment of the sequences of homologs, so that an Entrez user may easily map conserved sequence features onto the known 3D structure. These software features are intended to facilitate molecular biologists' identification of important structure-function relationships within protein families. Work this year has concentrated on improvements to CN3D. The software has been modified to use an industry-standard 3D graphics library, OpenGL, which provides much better quality molecular graphics rendering.
We have also added core-structure alignment editing and threading tools to the sequence display windows, to support curation of CDD (a Conserved Domain Database). Work is in progress to revise and simplify the data structures underlying CN3D, so that further improvements in graphics presentation, specific to describing conserved features in protein families, may be added to future versions of CN3D. A new version of CN3D incorporating these changes was released in June 2002 and had been downloaded by over 50,000 users as of October 2002. This version provides sophisticated alignment editing tools, in addition to greatly improved molecular graphics performance on popular computing platforms. As of October 2003, over 150,000 copies of CN3D have been downloaded. A new "related structures" link has been added to NCBI BLAST servers, to provide easy-to-use mapping to 3D structure whenever possible.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM000046-11
Application #: 6843566
Study Section: (CBB)
Support Year: 11
Fiscal Year: 2003

Name: National Library of Medicine
Country: United States

Marchler-Bauer, Aron; Anderson, John B; Chitsaz, Farideh et al. (2009) CDD: specific functional annotation with the Conserved Domain Database. Nucleic Acids Res 37:D205-10
Tyagi, Manoj; Shoemaker, Benjamin A; Bryant, Stephen H et al. (2009) Exploring functional roles of multibinding protein interfaces. Protein Sci 18:1674-83
Thompson, Kenneth Evan; Wang, Yanli; Madej, Tom et al. (2009) Improving protein structure similarity searches using domain boundaries based on conserved sequence information. BMC Struct Biol 9:33
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A et al. (2009) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 37:D5-15
Fong, Jessica H; Geer, Lewis Y; Panchenko, Anna R et al. (2007) Modeling the evolution of protein domain architectures using maximum parsimony. J Mol Biol 366:307-15
Marchler-Bauer, Aron; Anderson, John B; Derbyshire, Myra K et al. (2007) CDD: a conserved domain database for interactive domain family analysis. Nucleic Acids Res 35:D237-40
Madej, Thomas; Panchenko, Anna R; Chen, Jie et al. (2007) Protein homologous cores and loops: important clues to evolutionary relationships between structurally similar proteins. BMC Struct Biol 7:23
Wang, Yanli; Addess, Kenneth J; Chen, Jie et al. (2007) MMDB: annotating protein sequences with Entrez's 3D-structure database. Nucleic Acids Res 35:D298-300
Kann, Maricel G; Sheetlin, Sergey L; Park, Yonil et al. (2007) The identification of complete domains within protein sequences using accurate E-values for semi-global alignment. Nucleic Acids Res 35:4678-85
Chakrabarti, Saikat; Bryant, Stephen H; Panchenko, Anna R (2007) Functional specificity lies within the properties and evolutionary changes of amino acids. J Mol Biol 373:801-10

Showing the most recent 10 out of 19 publications