Determining a protein's three-dimensional structure is of critical importance in biology, providing insights into biological mechanisms and important targets for drug design. While high-resolution X-ray diffraction data provide an atomic view of cellular components, for many interesting and biologically relevant complexes it may only be possible to obtain low-resolution structural information. Both cryo-electron microscopy and X-ray crystallography, when applied to large, flexible molecular machines, often produce data of 3-6 Å resolution. Extracting detailed atomic information from these data, which is critical for understanding function and the effects of mutation and for designing drugs, is often infeasible because of the low number of observations and the large conformational space proteins may adopt. I propose to develop computational methods that bridge this "resolution gap" by extracting high-resolution atomic models from low-resolution data. My proposed research develops and extends our lab's methods for automatically inferring atomic-accuracy models from these "near-atomic" resolution sources of experimental data. We develop novel conformational sampling methods, guided by experimental data, to infer atomic information both in cases where homologous high-resolution data are available and in cases where they are not. Additionally, we propose to develop methods for estimating model uncertainty; these are critical for understanding to what degree structural conclusions may be drawn from a particular dataset. Finally, to push the resolution limit further, we develop general tools for biomolecular forcefield optimization. These machine-learning tools will allow development of a next-generation forcefield, which is critical for extending the resolution limit of data from which we can infer atomic details. The overall goal of the proposed research is robust and accessible methods for determining protein structures to atomic accuracy from only sparse experimental data.
Combined, the three aims in this proposal will lead to dramatic improvements in our ability to infer atomic interactions from sparse experimental data. This will lead to determination of structures that will reveal key insights into how biomedically important protein complexes perform their function and what goes wrong in human disease.

Public Health Relevance

This project develops computational tools for accurately determining protein structure from low-resolution experimental data. The proposed work will lead to dramatic improvements in our ability to model dynamic structures from sparse data. Obtaining accurate structures from these data will reveal insights into how biomedically important protein complexes perform their function and what goes wrong in human disease.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Flicker, Paula F
University of Washington
Schools of Medicine
United States
Cianfrocco, Michael A; Lahiri, Indrajit; DiMaio, Frank et al. (2018) cryoem-cloud-tools: A software platform to deploy and manage cryo-EM jobs in the cloud. J Struct Biol 203:230-235
Park, Hahnbeom; Kim, David E; Ovchinnikov, Sergey et al. (2018) Automatic structure prediction of oligomeric assemblies using Robetta in CASP12. Proteins 86 Suppl 1:283-291
Sui, Xuewu; Arlt, Henning; Brock, Kelly P et al. (2018) Cryo-electron microscopy structure of the lipid droplet-formation protein seipin. J Cell Biol 217:4080-4091
Kellogg, Elizabeth H; Hejab, Nisreen M A; Poepsel, Simon et al. (2018) Near-atomic model of microtubule-tau interactions. Science 360:1242-1246
Usluer, Gülsima D; DiMaio, Frank; Yang, Shun Kai et al. (2018) Cryo-EM structure of the bacterial actin AlfA reveals unique assembly and ATP-binding interactions and the absence of a conserved subdomain. Proc Natl Acad Sci U S A 115:3356-3361
Park, Hahnbeom; Ovchinnikov, Sergey; Kim, David E et al. (2018) Protein homology model refinement by large-scale energy optimization. Proc Natl Acad Sci U S A 115:3054-3059
Frenz, Brandon; Walls, Alexandra C; Egelman, Edward H et al. (2017) RosettaES: a sampling strategy enabling automated interpretation of difficult cryo-EM maps. Nat Methods 14:797-800
Xu, Jun; Lahiri, Indrajit; Wang, Wei et al. (2017) Structural basis for the initiation of eukaryotic transcription-coupled DNA repair. Nature 551:653-657