Structural biology has provided tens of thousands of three-dimensional protein structures, which give molecular-level information about many important biological processes. The coordinates of these structures are deposited in the Protein Data Bank (PDB), where they are available for download and further analysis. Most PDB users are interested in specific proteins relevant to a particular biological problem, but the same data can also be analyzed to identify empirical rules describing protein structures, which in turn can be used in protein structure prediction and simulation. This approach has been used successfully to derive the empirical potentials and rules used in programs such as Rosetta. However, a significant amount of the information in the PDB remains untapped, mostly for lack of adequate software tools. Here we propose to mine the PDB for information contained in structures of closely related or even identical proteins. We argue that such cases, removed as redundant in most existing analyses of protein structures, provide unique, albeit indirect, information about protein flexibility, and specifically about the ways and directions in which various protein folds change structure in response to mutations (evolutionary flexibility) or to activation or ligand binding (functional flexibility). From this analysis we anticipate deriving empirical rules about protein structural changes, rules that can be applied to speed up and/or direct simulations and to predict conformations of proteins in functional states not available from direct experiment. The main deliverable of this proposal would be a flexible comparative modeling toolkit: a series of algorithms and protocols that would allow users to apply empirical rules of protein structure change to their protein of interest.
This toolkit would be available from the project website as a server and would also be distributed as an open-source software package.
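
As a rough illustration of how two structures of the same protein can carry flexibility information, the sketch below compares two conformations through their C-alpha distance matrices and reports, for each residue, the mean absolute change in its distances to all other residues. This is a generic difference-distance-matrix calculation chosen for illustration, not the method of the proposed toolkit; the function names and the toy coordinates are hypothetical, and real input would be residue-matched C-alpha coordinates parsed from PDB entries.

```python
import math


def distance_matrix(coords):
    """Pairwise Euclidean distances between C-alpha positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def per_residue_change(coords_a, coords_b):
    """Mean absolute change in each residue's distances to all other residues.

    Both inputs are lists of (x, y, z) tuples for the same residues in the
    same order. No superposition is needed, because internal distances do
    not depend on how each structure is placed in space.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("conformations must have the same number of residues")
    n = len(coords_a)
    if n < 2:
        raise ValueError("at least two residues are required")
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    return [
        sum(abs(da[i][j] - db[i][j]) for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]


# Hypothetical toy data: a four-residue chain in which the last residue
# swings away from the chain axis in the second conformation.
extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
bent = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (7.6, 3.8, 0.0)]

scores = per_residue_change(extended, bent)
most_mobile = max(range(len(scores)), key=scores.__getitem__)
print(most_mobile, [round(s, 2) for s in scores])
```

Residues with large scores change their position relative to the rest of the chain between the two conformations, while residues inside a rigid segment change mainly with respect to the parts that move. Aggregating such profiles over many pairs of structures of the same protein is one conceivable route toward the kind of empirical rules described above.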

Public Health Relevance

In this project we propose to develop a flexible comparative modeling toolkit, program, and server that would provide a simple and fast way to predict structures of proteins in other functional states, thus offering a fast and inexpensive alternative to energy-based extrapolations and refinements and contributing to faster drug development and easier functional analysis of important protein targets.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM101457-04
Application #
8836556
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Smith, Ward
Project Start
2012-05-01
Project End
2016-04-30
Budget Start
2015-05-01
Budget End
2016-04-30
Support Year
4
Fiscal Year
2015
Total Cost
$370,500
Indirect Cost
$180,500
Name
Sanford-Burnham Medical Research Institute
Department
Type
DUNS #
020520466
City
La Jolla
State
CA
Country
United States
Zip Code
92037
Braten, Ori; Livneh, Ido; Ziv, Tamar et al. (2016) Numerous proteins with unique characteristics are degraded by the 26S proteasome following monoubiquitination. Proc Natl Acad Sci U S A 113:E4639-47
Hrabe, Thomas; Li, Zhanwen; Sedova, Mayya et al. (2016) PDBFlex: exploring flexibility in protein structures. Nucleic Acids Res 44:D423-8
Hrabe, Thomas; Jaroszewski, Lukasz; Godzik, Adam (2016) Revealing aperiodic aspects of solenoid proteins from sequence information. Bioinformatics 32:2776-82
Porta-Pardo, Eduard; Garcia-Alonso, Luz; Hrabe, Thomas et al. (2015) A Pan-Cancer Catalogue of Cancer Driver Protein Interaction Interfaces. PLoS Comput Biol 11:e1004518
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen et al. (2015) AIDA: ab initio domain assembly for automated multi-domain protein structure prediction and domain-domain interaction prediction. Bioinformatics 31:2098-105
Porta-Pardo, Eduard; Hrabe, Thomas; Godzik, Adam (2015) Cancer3D: understanding cancer mutations through protein structures. Nucleic Acids Res 43:D968-73
Li, Zhanwen; Natarajan, Padmaja; Ye, Yuzhen et al. (2014) POSA: a user-driven, interactive multiple protein structure alignment server. Nucleic Acids Res 42:W240-5
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen et al. (2014) AIDA: ab initio domain assembly server. Nucleic Acids Res 42:W308-13
Eroshkin, Alexey M; LeBlanc, Andrew; Weekes, Dana et al. (2014) bNAber: database of broadly neutralizing HIV antibodies. Nucleic Acids Res 42:D1133-9
Xu, Dong; Jaroszewski, Lukasz; Li, Zhanwen et al. (2014) FFAS-3D: improving fold recognition by including optimized structural features and template re-ranking. Bioinformatics 30:660-7

Showing the most recent 10 out of 19 publications