Most proteins are symmetric oligomeric complexes. Despite their prevalence and biomedical importance, such complexes are vastly underrepresented in the PDB, and determining their structures presents daunting challenges for NMR structural biologists. In particular, simulated annealing (SA), a widely used technique for structure determination of homo-oligomers, is vulnerable to significant structural errors. Due to assignment ambiguity, SA converges to local minima rather than to the optimal structure or structural ensemble indicated by the data. Fold Operator Theory overcomes these errors, using a systematic search algorithm shown to identify biologically important assignments and structures that SA does not find. For example, the published NMR and crystal structures of the enzyme Diacylglycerol Kinase (DAGK) have very different topologies. Our systematic search techniques not only showed that both published folds are supported by the NMR data, but also found a novel fold that satisfies the data better than either published fold. We propose to develop novel algorithms and software enabling global and systematic search for NMR structure determination, building on our preliminary results showing that our methods can solve problems where traditional stochastic NMR methods struggle. These new tools will dramatically increase the accuracy of NMR structure determination with assignment ambiguity, which unavoidably arises for higher-order symmetric homo-oligomers. The proposed Deep Topological Sampling (DTS) has two primary modules: Fold Operator Theory (FOT) and DISCO (which we recently used to solve the structure of a membrane-associated MPER homo-trimer designed to probe immunogenic responses to the HIV-1 viral coat protein gp41).
Aim 1: We will implement a general FOT in software, to compute all the protein folds consistent with the NMR data. FOT will search globally over folds, and avoid being trapped in local minima, to find all satisfying structures.
Aim 2: We will develop our DISCO algorithm to search within each viable fold generated by FOT to find all feasible low-energy structures. DISCO and FOT will exploit novel geometric and topological algorithms to perform automated assignments accurately and efficiently, thus alleviating the most time-consuming and potentially error-prone step in multimeric structure determination.
Aim 3: We will apply our FOT/DTS software (developed in Aims 1-2) prospectively to important systems. (A) We will perform experiments to determine the true functional structure DAGK adopts in its native environment. (B) We will use our methods to determine the structure of a larger HIV-1 membrane-associated pre-fusion gp41 trimer construct exposing transient, intermediate epitopes that bind broadly neutralizing antibodies, but are structurally invisible in larger laboratory constructs. (C) We will solve the hemifusion intermediate structures of the antigenic, symmetric homo-oligomeric domains of the Zika virus envelope protein, an emerging global health threat. Our novel open-source software can be applied to a vast array of symmetric protein targets in viral and bacterial pathogens.

Public Health Relevance

Symmetric oligomeric protein complexes are ubiquitous, and are involved in essential cellular processes, making them ideal therapeutic targets. We will develop novel computational methods to determine the three-dimensional structures of these protein complexes using Nuclear Magnetic Resonance (NMR) measurements, which will enable mechanistic understanding and facilitate rational drug design. We will apply our methodology to (A) unambiguously determine the structure of diacylglycerol kinase (DAGK), a membrane-bound enzyme of bacterial lipid biosynthesis and potential therapeutic target, (B) interrogate the structure of the membrane-proximal region of Human Immunodeficiency Virus-1 (HIV-1) viral coat protein glycoprotein 41 (gp41), which is essential for viral entry and a target of HIV drugs and broadly neutralizing antibodies, and (C) solve the membrane-proximal hemifusion intermediate structures of the antigenic, symmetric homo-oligomeric domains of the Zika virus envelope (ZIKV E) protein, an emerging global health threat. Our novel open-source software can be applied to a vast array of symmetric protein targets in viral and bacterial pathogens.

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM118543-02
Application #
9567184
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Wehrle, Janna P
Project Start
2017-09-18
Project End
2021-07-31
Budget Start
2018-08-01
Budget End
2019-07-31
Support Year
2
Fiscal Year
2018
Total Cost
Indirect Cost
Name
Duke University
Department
Biostatistics & Other Math Sci
Type
Schools of Arts and Sciences
DUNS #
044387793
City
Durham
State
NC
Country
United States
Zip Code
27705