Structural biology has provided tens of thousands of three-dimensional protein structures, which have yielded molecular-level information about many important biological processes. The coordinates of these structures are deposited in the Protein Data Bank (PDB), where they are available for download and further analysis. Most PDB users are interested in specific proteins relevant to a particular biological problem, but the same data can also be analyzed to identify empirical rules describing protein structures, which in turn can be used in protein structure prediction and simulation. This approach has been used successfully to derive the empirical potentials and rules used in programs such as Rosetta. However, a significant amount of the information available in the PDB remains untapped, mostly for lack of adequate software tools. Here we propose to mine the PDB for information contained in structures of closely related or even identical proteins. We argue that such cases, removed as "redundant" in most existing analyses of protein structures, provide unique, if indirect, information about protein flexibility, and specifically about the ways and directions in which various protein folds change structure in response to mutations (evolutionary flexibility) or to activation or ligand binding (functional flexibility). From this analysis, we anticipate deriving empirical rules about protein structural changes that can be applied to speed up and/or direct simulations and to predict conformations of proteins in functional states not available from direct experiment. The main deliverable of this proposal would be a "flexible comparative modeling" toolkit: a series of algorithms and protocols that would allow users to apply empirical rules of protein structure change to their protein of interest. This toolkit would be available as a server from the project website and would also be distributed as an open-source software package.

Public Health Relevance

In this project we propose to develop a flexible comparative modeling toolkit, program, and server that would provide a simple and fast way to predict the structures of proteins in other functional states. It would thus offer a fast and inexpensive alternative to energy-based extrapolations and refinements, contributing to faster drug development and easier functional analysis of important protein targets.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Smith, Ward
Sanford-Burnham Medical Research Institute
La Jolla
United States
Hrabe, Thomas; Godzik, Adam (2014) ConSole: using modularity of contact maps to locate solenoid domains in protein structures. BMC Bioinformatics 15:119
Trame, Christine B; Chang, Yuanyuan; Axelrod, Herbert L et al. (2014) New mini-zincin structures provide a minimal scaffold for members of this metallopeptidase superfamily. BMC Bioinformatics 15:1
Li, Zhanwen; Natarajan, Padmaja; Ye, Yuzhen et al. (2014) POSA: a user-driven, interactive multiple protein structure alignment server. Nucleic Acids Res 42:W240-5
Das, Debanu; Murzin, Alexey G; Rawlings, Neil D et al. (2014) Structure and computational analysis of a novel protein with metallopeptidase-like and circularly permuted winged-helix-turn-helix domains reveals a possible role in modified polysaccharide biosynthesis. BMC Bioinformatics 15:75
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen et al. (2014) FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking. Bioinformatics 30:660-7
Eroshkin, Alexey M; LeBlanc, Andrew; Weekes, Dana et al. (2014) bNAber: database of broadly neutralizing HIV antibodies. Nucleic Acids Res 42:D1133-9
Kuraku, Shigehiro; Zmasek, Christian M; Nishimura, Osamu et al. (2013) aLeaves facilitates on-demand exploration of metazoan gene family trees on MAFFT sequence alignment server with enhanced interactivity. Nucleic Acids Res 41:W22-8
Coggill, Penelope; Eberhardt, Ruth Y; Finn, Robert D et al. (2013) Two Pfam protein families characterized by a crystal structure of protein lpg2210 from Legionella pneumophila. BMC Bioinformatics 14:265
Chang, Roger L; Andrews, Kathleen; Kim, Donghyuk et al. (2013) Structural systems biology evaluation of metabolic thermotolerance in Escherichia coli. Science 340:1220-3
Bhabha, Gira; Ekiert, Damian C; Jennewein, Madeleine et al. (2013) Divergent evolution of protein conformational dynamics in dihydrofolate reductase. Nat Struct Mol Biol 20:1243-9