A detailed and accurate understanding of the structure of proteins is one cornerstone of modern biomedical research, and an explicit goal of the NIH is to define the structure of all proteins either by accurate experimental determination or comparative model-building. The most successful structure prediction approaches employ empirical knowledge-based energy terms derived from features of known protein structures - most notably single-residue φ,ψ-distributions, backbone-dependent side chain rotamer preferences, and tight packing criteria. One known unrealistic feature of these prediction programs is the assumption of a fixed ideal geometry for the backbone. The driving hypothesis behind this proposal is that there exists a largely unappreciated but real, systematic, significant and pervasive variation in backbone bond angles and peptide planarity that occurs as a function of backbone torsion angles, and accounting properly for this variation will be required to achieve X-ray crystal structure quality for comparative models. The overall goal of this work is to generate accurate empirical values for this covalent variation that will lead to tangible improvements in the accuracy of structures produced by comparative modeling and de novo structure prediction as well as by X-ray crystallography. We propose to achieve this overall goal by pursuing the following three specific aims: 1) to design, develop, and make available a flexibly-searchable database containing bond lengths, bond angles, and torsion angles for all structures known at better than 1.75 Å resolution (currently ~500,000 residues); 2) to use conventional query-based and modern machine learning approaches to derive accurate empirical information from the database about the systematic correlation of local conformation with variations in covalent geometry; and 3) to create a modular conformation-dependent expected covalent geometry library and to facilitate its incorporation into leading applications for comparative and crystallographic protein structure modeling. With the dramatically increased number of ultrahigh-resolution crystal structures now known, the time is ripe for construction of this Protein Geometry Database that will provide facile access to a massive treasure trove of reliable and detailed empirical information about protein structure. To be done well, this work will require painstaking attention to detail and an intimate familiarity with the limitations of crystallographic refinement and the principles of protein structure. Dr. Karplus is well-suited to lead this work as he has a 20+-year track record of quality crystallographic structure determinations combined with contributions of more general insights into protein structure, among them being the pioneering characterization of the conformation-dependent variations in covalent geometry that serves as this project's foundation. Collaborations with world-leading groups in structure prediction, in crystallographic refinement and structure validation, and in knowledge-based library development ensure a rapid and effective translation of the gleaned information into improvements in protein modeling.
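
As a rough illustration of what a conformation-dependent expected geometry library in aim 3 might look like, the sketch below averages an observed backbone bond angle within bins of the φ,ψ torsion angles and falls back to a single fixed ideal value where no data exist. It is not the project's implementation: the function names, the 10° bin width, the fallback value, and the sample observations are all assumptions made up for illustration.

```python
from collections import defaultdict
from statistics import mean

BIN_WIDTH = 10.0  # degrees; an assumed bin size, not taken from the project


def bin_index(angle_deg, width=BIN_WIDTH):
    """Map a torsion angle in degrees to a bin index after wrapping into [-180, 180)."""
    wrapped = (angle_deg + 180.0) % 360.0 - 180.0
    return int((wrapped + 180.0) // width)


def build_library(residues, width=BIN_WIDTH):
    """Average an observed bond angle within each (phi, psi) bin.

    residues: iterable of (phi, psi, bond_angle) tuples, all in degrees.
    Returns a dict mapping (phi_bin, psi_bin) to the mean bond angle.
    """
    grouped = defaultdict(list)
    for phi, psi, angle in residues:
        grouped[(bin_index(phi, width), bin_index(psi, width))].append(angle)
    return {key: mean(values) for key, values in grouped.items()}


def expected_angle(library, phi, psi, fallback, width=BIN_WIDTH):
    """Look up the conformation-dependent value, or return the fixed fallback."""
    key = (bin_index(phi, width), bin_index(psi, width))
    return library.get(key, fallback)


# Made-up observations for illustration only: (phi, psi, bond angle in degrees).
observations = [
    (-63.0, -43.0, 111.2),
    (-61.0, -41.0, 111.6),
    (-120.0, 130.0, 109.0),
]
library = build_library(observations)

# A residue near the first two observations gets their bin average;
# an unpopulated region falls back to the assumed single ideal value.
helix_like = expected_angle(library, -62.0, -42.0, fallback=111.0)
unpopulated = expected_angle(library, 60.0, 60.0, fallback=111.0)
```

Hard bin edges are the simplest choice here; a real library would likely need smoothing between neighboring bins and a minimum number of observations per bin before trusting an average.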

Public Health Relevance

Proteins are responsible for carrying out most of the processes of life and their function depends exquisitely on their structure, even on the tiniest structural details. For this reason, determining accurate structures of proteins is a cornerstone of modern biomedical research. This work is aimed at leading to a universal improvement in the accuracy with which protein structure can be built.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM083136-04
Application #
8111114
Study Section
Special Emphasis Panel (ZRG1-CB-M (90))
Program Officer
Hagan, Ann A
Project Start
2008-08-01
Project End
2012-09-17
Budget Start
2011-08-01
Budget End
2012-09-17
Support Year
4
Fiscal Year
2011
Total Cost
$208,186
Indirect Cost
Name
Oregon State University
Department
Biochemistry
Type
Schools of Arts and Sciences
DUNS #
053599908
City
Corvallis
State
OR
Country
United States
Zip Code
97339
Brereton, Andrew E; Karplus, P Andrew (2018) Ensemblator v3: Robust atom-level comparative analyses and classification of protein structure ensembles. Protein Sci 27:41-50
Evangelidis, Thomas; Nerli, Santrupti; Nováček, Jiří et al. (2018) Automated NMR resonance assignments and structure determination using a minimal set of 4D spectra. Nat Commun 9:384
Hollingsworth, Scott A; Lewis, Matthew C; Karplus, P Andrew (2016) Beyond basins: φ,ψ preferences of a residue depend heavily on the φ,ψ values of its neighbors. Protein Sci 25:1757-62
Sharaf, Naima G; Brereton, Andrew E; Byeon, In-Ja L et al. (2016) NMR structure of the HIV-1 reverse transcriptase thumb subdomain. J Biomol NMR 66:273-280
Moriarty, Nigel W; Tronrud, Dale E; Adams, Paul D et al. (2016) A new default restraint library for the protein backbone in Phenix: a conformation-dependent geometry goes mainstream. Acta Crystallogr D Struct Biol 72:176-9
Brereton, Andrew E; Karplus, P Andrew (2016) On the reliability of peptide nonplanarity seen in ultra-high resolution crystal structures. Protein Sci 25:926-32
Li, Wenlin; Kinch, Lisa N; Karplus, P Andrew et al. (2015) ChSeq: A database of chameleon sequences. Protein Sci 24:1075-86
Brereton, Andrew E; Karplus, P Andrew (2015) Native proteins trap high-energy transit conformations. Sci Adv 1:e1501188
Karplus, P Andrew; Diederichs, Kay (2015) Assessing and maximizing data quality in macromolecular crystallography. Curr Opin Struct Biol 34:60-8
Clark, Sarah A; Tronrud, Dale E; Karplus, P Andrew (2015) Residue-level global and local ensemble-ensemble comparisons of protein domains. Protein Sci 24:1528-42

Showing the most recent 10 out of 26 publications