Computational structure-based protein design is a transformative field with exciting prospects for advancing both basic science and translational medical research. My laboratory has developed new protein design algorithms and used them to design new drugs for leukemia, redesign an enzyme to diversify current antibiotics, design protein-peptide interactions to treat cystic fibrosis, design probes to isolate broadly neutralizing HIV antibodies, and predict MRSA resistance to new antibiotics. Central to protein design methodology is the need to optimize the amino acid sequence, placement of side chains, and backbone conformations in protein structures. By developing advanced search and scoring algorithms for combinatorial optimization of protein and ligand structure and sequence, we showed that desired structure, affinity, and activity can be designed by (a) modeling improved molecular flexibility and (b) exploiting ensembles of structures for accurate predictions. Our suite of algorithms has mathematical guarantees on the solution quality (up to the accuracy of the input model, which includes the initial structures, the molecular flexibility to be modeled, and an empirical molecular mechanics energy function). Specifically, our algorithms are guaranteed to compute the global minimum energy conformation (GMEC), a gap-free list of sequences and structures in order of predicted energy, and a provably good approximation to the binding affinity by bounding partition functions over molecular ensembles. We tested our algorithms prospectively, and experimental validation included construction of mutant proteins, measurement of binding affinity, enzyme kinetics and stability, crystal structures, NMR structures, viral neutralization, and in-cell activity. We propose to build on our foundation of protein design algorithms, called OSPREY, and apply them in areas of biochemical and pharmacological importance.
We will (1) predict future resistance mutations in protein targets of novel drugs; (2) design inhibitors of protein:protein interactions to target today's "undruggable" proteins; and (3) use our design methodology to discover and improve broadly neutralizing HIV-1 antibodies. We will advance the state of the art in protein design by making algorithmic and modeling improvements that increase accuracy and scope in support of Aims (1-3) above, including: the modeling of more protein and ligand flexibility during design; new combinatorial optimization and energy-bounding methods to accelerate the design search; and design of affinity and specificity using novel positive and negative design algorithms that model thermodynamic molecular ensembles. We will test our design predictions prospectively, by making novel predicted mutant proteins and performing biochemical, biological, and structural studies. We will also validate our algorithms retrospectively, using existing structures and data. All software we develop will be released open-source.
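
As a rough illustration of the three quantities named in the abstract (the GMEC, an energy-ordered list of conformations, and a partition-function-based affinity estimate), the following minimal sketch enumerates a toy rotamer model by brute force. It is not OSPREY's implementation: the abstract describes provable search and bounding algorithms, whereas this sketch simply enumerates every conformation. All function names, the energy tables, and the value of `RT` are assumptions made for illustration only.

```python
import itertools
import math

# Assumed thermal energy in kcal/mol near room temperature (illustrative value).
RT = 0.593


def conformation_energy(conf, single, pair):
    """Sum one-body and pairwise energies for one rotamer assignment."""
    energy = sum(single[i][r] for i, r in enumerate(conf))
    for i, j in itertools.combinations(range(len(conf)), 2):
        energy += pair[(i, j)][conf[i]][conf[j]]
    return energy


def ranked_conformations(single, pair):
    """Return every conformation as (energy, conf), lowest energy first.

    Because nothing is skipped, the list is trivially gap-free; the first
    entry is the global minimum energy conformation of this toy model.
    """
    choices = [range(len(rotamers)) for rotamers in single]
    scored = [
        (conformation_energy(conf, single, pair), conf)
        for conf in itertools.product(*choices)
    ]
    return sorted(scored)


def partition_function(energies, rt=RT):
    """Boltzmann-weighted sum over an ensemble of conformation energies."""
    return sum(math.exp(-e / rt) for e in energies)


def affinity_score(complex_energies, protein_energies, ligand_energies):
    """Ratio of bound to unbound partition functions (a K*-style estimate)."""
    bound = partition_function(complex_energies)
    unbound = partition_function(protein_energies) * partition_function(ligand_energies)
    return bound / unbound


# Hypothetical toy model: two residue positions with two rotamers each.
single = [[0.0, 1.0], [0.5, 0.0]]
pair = {(0, 1): [[2.0, -1.0], [0.0, 0.5]]}

ranked = ranked_conformations(single, pair)
gmec_energy, gmec_conf = ranked[0]
```

Exhaustive enumeration grows exponentially with the number of positions, which is why practical design software relies on pruning and bounding rather than this kind of direct loop.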

Public Health Relevance

We propose computational structure-based protein design algorithms that could revolutionize therapeutic treatment. Our algorithms will enable the design of proteins and other molecules to act on today's undruggable proteins and tomorrow's drug-resistant diseases. In the next grant period, we will develop novel protein design algorithms and software, and use them to (1) predict future resistance mutations to new drugs in pathogens responsible for deadly nosocomial and community-acquired infections: methicillin-resistant Staphylococcus aureus (MRSA), vancomycin-resistant Enterococcus (VRE), and Candida glabrata; (2) design inhibitors of protein:protein interactions that address the underlying genetic defect in cystic fibrosis patients and alleviate their symptoms; and (3) discover, improve, and design broadly neutralizing antibodies against human immunodeficiency virus (HIV).

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Wehrle, Janna P
Duke University
Biostatistics & Other Math Sci
Schools of Arts and Sciences
United States
Qi, Yang; Martin, Jeffrey W; Barb, Adam W et al. (2018) Continuous Interdomain Orientation Distributions Reveal Components of Binding Thermodynamics. J Mol Biol 430:3412-3426
Ojewole, Adegoke A; Jou, Jonathan D; Fowler, Vance G et al. (2018) BBK* (Branch and Bound Over K*): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces. J Comput Biol 25:726-739
Ojewole, Adegoke; Lowegard, Anna; Gainza, Pablo et al. (2017) OSPREY Predicts Resistance Mutations Using Positive and Negative Computational Protein Design. Methods Mol Biol 1529:291-306
Jain, Swati; Jou, Jonathan D; Georgiev, Ivelin S et al. (2017) A critical analysis of computational protein design with sparse residue interaction graphs. PLoS Comput Biol 13:e1005346
Hallen, Mark A; Jou, Jonathan D; Donald, Bruce R (2017) LUTE (Local Unpruned Tuple Expansion): Accurate Continuously Flexible Protein Design with General Energy Functions and Rigid Rotamer-Like Efficiency. J Comput Biol 24:536-546
Hallen, Mark A; Donald, Bruce R (2017) CATS (Coordinates of Atoms by Taylor Series): protein design with backbone flexibility in all locally feasible directions. Bioinformatics 33:i5-i12
Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang (2017) Parallel Computational Protein Design. Methods Mol Biol 1529:265-277
Jou, Jonathan D; Jain, Swati; Georgiev, Ivelin S et al. (2016) BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design. J Comput Biol 23:413-24
Gainza, Pablo; Nisonoff, Hunter M; Donald, Bruce R (2016) Algorithms for protein design. Curr Opin Struct Biol 39:16-26
Hallen, Mark A; Donald, Bruce R (2016) comets (Constrained Optimization of Multistate Energies by Tree Search): A Provable and Efficient Protein Design Algorithm to Optimize Binding Affinity and Specificity with Respect to Sequence. J Comput Biol 23:311-21

Showing the most recent 10 out of 35 publications