Joint Center for Molecular Modeling

Godzik, Adam

Abstract

New tools based on graph theory have revolutionized genome analyses, providing better ways to identify and classify rearrangements of genomic fragments. The same tools recently also provided a major breakthrough in multiple sequence alignment. Here we propose to apply these tools to protein structure analysis and to use the resulting insights into protein structure evolution to increase model quality in comparative modeling. Structure comparisons between distant homologs show clearly that the dominant paradigm in structure comparison, that a protein structure can be divided into an invariant core and flexible loops, breaks down below the 40%-50% sequence identity threshold. Instead, significant rearrangements can happen anywhere in the structure, with secondary structure elements undergoing significant shifts and movements. As a result, the standard protocol in comparative modeling, based on sequence mounting on a rigid core structure, must fail for such homologs. Structural differences between homologs are driven, as is the entire folding process, by the free energy of the system, but because of serious deficiencies in current force fields and computational approaches, energy-based predictions of such changes are not successful. In this grant we propose to improve the quality of comparative modeling by first discovering and then applying empirical rules of protein structure changes. The rapid growth in the number of known protein structures, fueled in part by technical advances in high-throughput structure determination spearheaded by the Protein Structure Initiative, has resulted in increasingly dense coverage of the structural space of many folds. This provides a rich learning base for discovering such empirical rules, provided that the right formalism to describe protein structure changes can be developed.
In preliminary analyses we have shown that, in the next approximation beyond the invariant core/flexible loops model, a protein structure can be described as built from rigid subdomains, and that simple rearrangements of these subdomains account for almost half of the structural differences between distant homologs. Moreover, proteins can only adopt structures lying in a specific low-dimensional subspace of the entire conformational space. To improve the quality of models from comparative modeling, we plan to identify conserved subdomains for all known folds and to describe the allowed subspaces by analyzing already known structures from these folds. In the next step we will use this information to generate possible variants of the template structure and use model evaluation tools to identify the one most similar to the [sic].
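The idea of a low-dimensional subspace of allowed structures can be illustrated with a minimal sketch. It is not the method proposed in the grant, which the abstract does not specify. It assumes a set of structures from one fold that have already been superposed and share the same atoms, and it uses principal component analysis as a stand-in technique. The function names, the 90% variance threshold, and the synthetic ensemble are all hypothetical and chosen only for illustration.

```python
import numpy as np

def explained_variance_ratio(coords):
    """Fraction of coordinate variance carried by each principal component.

    coords: array of shape (n_structures, n_atoms, 3), assumed to be
    already superposed onto a common reference frame.
    """
    n_structures = coords.shape[0]
    flat = coords.reshape(n_structures, -1)      # one row per structure
    centered = flat - flat.mean(axis=0)          # remove the mean structure
    # Squared singular values of the centered data are proportional
    # to the variance along each principal component.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variance = singular_values ** 2
    return variance / variance.sum()

def n_components_for(ratio, threshold=0.9):
    """Smallest number of components whose cumulative variance >= threshold."""
    return int(np.searchsorted(np.cumsum(ratio), threshold) + 1)

# Synthetic ensemble (illustrative only, not real protein data):
# a base structure deformed along two collective modes plus small noise.
rng = np.random.default_rng(0)
n_structures, n_atoms, n_modes = 40, 50, 2
base = rng.normal(size=(n_atoms, 3))
modes = rng.normal(size=(n_modes, n_atoms, 3))
amplitudes = rng.normal(size=(n_structures, n_modes))
noise = 0.01 * rng.normal(size=(n_structures, n_atoms, 3))
ensemble = base + np.tensordot(amplitudes, modes, axes=1) + noise

ratio = explained_variance_ratio(ensemble)
k = n_components_for(ratio, threshold=0.9)
print(f"components needed for 90% of variance: {k} of {ratio.size}")
```

In this toy ensemble the coordinates span 150 dimensions, yet nearly all of the variation lies along the two built-in modes. For real structures, the superposition step and the choice of equivalent atoms would need to be handled before any such analysis.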

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Exploratory Grants (P20)
Project #: 3P20GM076221-03S1
Application #: 7780895
Study Section: Special Emphasis Panel (ZGM1-CBB-3 (HM))
Program Officer: Smith, Ward

Project Start: 2006-04-01
Project End: 2010-10-31
Budget Start: 2009-04-01
Budget End: 2010-10-31
Support Year: 3
Fiscal Year: 2009
Total Cost: $651,131
Indirect Cost

Institution

Name: Sanford-Burnham Medical Research Institute
Department
Type
DUNS #: 020520466

City: La Jolla
State: CA
Country: United States
Zip Code: 92037

Related projects


NIH 2009 P20 GM	Joint Center for Molecular Modeling	Godzik, Adam / Sanford-Burnham Medical Research Institute	$651,131
NIH 2008 P20 GM	Joint Center for Molecular Modeling	Godzik, Adam / Sanford-Burnham Medical Research Institute	$651,131
NIH 2007 P20 GM	Joint Center for Molecular Modeling	Godzik, Adam / Sanford-Burnham Medical Research Institute	$671,577
NIH 2006 P20 GM	Joint Center for Molecular Modeling	Godzik, Adam / Sanford-Burnham Medical Research Institute	$700,000

Publications

Cai, Xiao-Hui; Jaroszewski, Lukasz; Wooley, John et al. (2011) Internal organization of large protein families: relationship between the sequence, structure, and function-based clustering. Proteins 79:2389-402

Zhang, Qing; Zmasek, Christian M; Cai, Xiaohui et al. (2011) TIR domain-containing adaptor SARM is a late addition to the ongoing microbe-host dialog. Dev Comp Immunol 35:461-8

Zakharov, Mikhail N; Pillai, Biju K; Bhasin, Shalender et al. (2011) Dynamics of coregulator-induced conformational perturbations in androgen receptor ligand binding domain. Mol Cell Endocrinol 341:1-8

Zmasek, Christian M; Godzik, Adam (2011) Strong functional patterns in the evolution of eukaryotic genomes revealed by the reconstruction of ancestral protein domain repertoires. Genome Biol 12:R4

Weekes, Dana; Krishna, S Sri; Bakolitsa, Constantina et al. (2010) TOPSAN: a collaborative annotation environment for structural genomics. BMC Bioinformatics 11:426

Zhang, Qing; Zmasek, Christian M; Godzik, Adam (2010) Domain architecture evolution of pattern-recognition receptors. Immunogenetics 62:263-72

Ellrott, Kyle; Jaroszewski, Lukasz; Li, Weizhong et al. (2010) Expansion of the protein repertoire in newly explored environments: human gut microbiome specific protein families. PLoS Comput Biol 6:e1000798

Zhang, Ying; Thiele, Ines; Weekes, Dana et al. (2009) Three-dimensional structural view of the central metabolic network of Thermotoga maritima. Science 325:1544-9

Istomin, Andrei Y; Godzik, Adam (2009) Understanding diversity of human innate immunity receptors: analysis of surface features of leucine-rich repeat domains in NLRs and TLRs. BMC Immunol 10:48

Verberkmoes, Nathan C; Russell, Alison L; Shah, Manesh et al. (2009) Shotgun metaproteomics of the human distal gut microbiota. ISME J 3:179-89

Showing the most recent 10 out of 21 publications
