Realization of novel molecular function requires the ability to alter molecular complex formation. Enzymatic function can be altered by changing enzyme-substrate interactions via modification of an enzyme's active site. We propose a new algorithm for protein redesign, which combines a statistical mechanics-derived ensemble-based approach to computing the binding constant with the speed and completeness of a branch-and-bound pruning algorithm. In addition, we propose an efficient, deterministic approximation algorithm, capable of approximating our scoring function to arbitrary precision. Our ensemble-based algorithm, which flexibly models both protein and ligand using rotamer-based partition functions, has application in enzyme redesign, the prediction of protein-ligand binding, and computer-aided drug design. In preliminary studies, we redesigned the phenylalanine-specific adenylation domain of the non-ribosomal peptide synthetase Gramicidin Synthetase A (NRPS GrsA-PheA). Ensemble scoring, using a rotameric approximation to the partition functions of the bound and unbound states for GrsA-PheA, was used to switch the enzyme specificity toward leucine (Leu) and tyrosine (Tyr), using novel active site sequences computationally predicted by searching through the space of possible active site mutations. The top-scoring in silico mutants were created in vitro, and binding and catalytic activity were measured. Several of the top-ranked mutations exhibit the desired change in specificity from Phe to Leu or Tyr. When considering protein flexibility and molecular ensembles for protein design, a major challenge has been the development of ensemble-based redesign algorithms that efficiently prune mutations and conformations. The proposed K* ("K-star") method generalizes Boltzmann-based scoring to ensembles and applies the result to protein design.
K* prunes the vast majority of conformations, thereby reducing execution time and making a mutation search that considers both ligand and protein flexibility computationally feasible. In addition to redesigning PheA, the K* algorithm will be used to reprogram the specificity of other NRPS domains, whose products include natural antibiotics, antifungals, antivirals, immunosuppressants, and antineoplastics. We will also use our algorithms to redesign two restriction endonucleases (REs), and will apply K* to design peptide inhibitors for the CAL (Cystic fibrosis transmembrane conductance regulator Associated Ligand) PDZ domain. Our algorithms will predict NRPS and RE mutants with putative novel function; we will create the mutant proteins and test our predictions using biochemical activity assays and new crystal structures. We will test the predicted CAL-binding peptides both in vitro and in vivo.

Project Narrative: Enzyme redesign provides a good test of our understanding of proteins. The long-term goal of our research is to develop novel algorithms to plan structure-based site-directed mutations to a protein's active site in order to modify its function. We will develop general planning software that can reprogram the specificity of many proteins, including NRPS domains, whose products include natural antibiotics, antifungals, antivirals, immunosuppressants, and antineoplastics. These engineered enzymes should enable combinatorial biosynthesis of novel pharmacologically active compounds, yielding new leads for drug design. The proposed application of our algorithms to restriction endonucleases and the CAL PDZ domain could lead, respectively, to biotechnology advances and novel therapeutic interventions for cystic fibrosis.
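The ensemble score described in the abstract can be illustrated with a small sketch. It assumes the common formulation in which the binding constant is approximated by a ratio of rotamer-based partition functions, bound complex over unbound protein times unbound ligand. The function names, the toy energies, and the sorted-energy stopping rule below are illustrative assumptions, not the project's actual implementation. In particular, a real search would enumerate conformations in energy order using bounds rather than sorting a full list.

```python
import math

# Assumption: energies in kcal/mol, RT at roughly 298 K.
RT = 0.592


def partition_function(energies, rt=RT):
    """Boltzmann-weighted sum over a discrete (rotameric) ensemble."""
    return sum(math.exp(-e / rt) for e in energies)


def approx_partition_function(energies, epsilon, rt=RT):
    """Return (partial_sum, n_visited) with partial_sum >= (1 - epsilon) * q.

    Conformations are visited in order of increasing energy. Every
    unvisited conformation has energy >= the current one, so its weight
    is at most the current weight; that bounds the ignored remainder.
    Requires 0 < epsilon < 1.
    """
    ordered = sorted(energies)
    partial = 0.0
    for i, e in enumerate(ordered):
        w = math.exp(-e / rt)
        partial += w
        remaining = len(ordered) - (i + 1)
        # Stop once remainder <= epsilon / (1 - epsilon) * partial.
        if epsilon * partial >= (1.0 - epsilon) * remaining * w:
            return partial, i + 1
    return partial, len(ordered)


def kstar_score(bound, protein, ligand, rt=RT):
    """Ratio of partition functions: q_PL / (q_P * q_L)."""
    q_pl = partition_function(bound, rt)
    q_p = partition_function(protein, rt)
    q_l = partition_function(ligand, rt)
    return q_pl / (q_p * q_l)


# Hypothetical toy ensembles (not data from the project).
bound_energies = [-10.0, -9.5, -3.0]
protein_energies = [-4.0, -3.5]
ligand_energies = [-2.0, -1.0]
score = kstar_score(bound_energies, protein_energies, ligand_energies)
```

A higher score indicates that the bound ensemble is favored over the unbound ensembles. The early-stopping rule shows, in simplified form, how most high-energy conformations can be skipped while keeping a guaranteed bound on the error of the partition function.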

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Macromolecular Structure and Function D Study Section (MSFD)
Program Officer: Hagan, Ann A
Duke University
Biostatistics & Other Math Sci
Other Domestic Higher Education
United States
Ojewole, Adegoke; Lowegard, Anna; Gainza, Pablo et al. (2017) OSPREY Predicts Resistance Mutations Using Positive and Negative Computational Protein Design. Methods Mol Biol 1529:291-306
Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang (2017) Parallel Computational Protein Design. Methods Mol Biol 1529:265-277
Jou, Jonathan D; Jain, Swati; Georgiev, Ivelin S et al. (2016) BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design. J Comput Biol 23:413-24
Gainza, Pablo; Nisonoff, Hunter M; Donald, Bruce R (2016) Algorithms for protein design. Curr Opin Struct Biol 39:16-26
Hallen, Mark A; Donald, Bruce R (2016) COMETS (Constrained Optimization of Multistate Energies by Tree Search): A Provable and Efficient Protein Design Algorithm to Optimize Binding Affinity and Specificity with Respect to Sequence. J Comput Biol 23:311-21
Pan, Yuchao; Dong, Yuxi; Zhou, Jingtian et al. (2016) cOSPREY: A Cloud-Based Distributed Algorithm for Large-Scale Computational Protein Design. J Comput Biol 23:737-49
Hallen, Mark A; Jou, Jonathan D; Donald, Bruce R (2016) LUTE (Local Unpruned Tuple Expansion): Accurate Continuously Flexible Protein Design with General Energy Functions and Rigid Rotamer-Like Efficiency. J Comput Biol :
Traoré, Seydou; Roberts, Kyle E; Allouche, David et al. (2016) Fast search algorithms for computational protein design. J Comput Chem 37:1048-58
Martin, Jeffrey W; Zhou, Pei; Donald, Bruce R (2015) Systematic solution to homo-oligomeric structures determined by NMR. Proteins 83:651-61
Hallen, Mark A; Gainza, Pablo; Donald, Bruce R (2015) Compact Representation of Continuous Energy Surfaces for More Efficient Protein Design. J Chem Theory Comput 11:2292-306
