Realization of novel molecular function requires the ability to alter molecular complex formation. Enzymatic function can be altered by changing enzyme-substrate interactions via modification of an enzyme's active site. We propose a new algorithm for protein redesign, which combines a statistical mechanics-derived ensemble-based approach to computing the binding constant with the speed and completeness of a branch-and-bound pruning algorithm. In addition, we propose an efficient, deterministic approximation algorithm, capable of approximating our scoring function to arbitrary precision. Our ensemble-based algorithm, which flexibly models both protein and ligand using rotamer-based partition functions, has application in enzyme redesign, the prediction of protein-ligand binding, and computer-aided drug design.

In preliminary studies, we redesigned the phenylalanine-specific adenylation domain of the non-ribosomal peptide synthetase Gramicidin Synthetase A (NRPS GrsA-PheA). Ensemble scoring, using a rotameric approximation to the partition functions of the bound and unbound states for GrsA-PheA, was used to switch the enzyme specificity toward leucine (Leu) and tyrosine (Tyr), using novel active site sequences computationally predicted by searching through the space of possible active site mutations. The top-scoring in silico mutants were created in vitro, and binding and catalytic activity were measured. Several of the top-ranked mutations exhibit the desired change in specificity from Phe to Leu or Tyr.

When considering protein flexibility and molecular ensembles for protein design, a major challenge has been the development of ensemble-based redesign algorithms that efficiently prune mutations and conformations. The proposed K* ("K-star") method generalizes Boltzmann-based scoring to ensembles and applies the result to protein design. K* prunes the vast majority of conformations, thereby reducing execution time and making a mutation search that considers both ligand and protein flexibility computationally feasible. In addition to redesigning PheA, the K* algorithm will be used to reprogram the specificity of other NRPS domains, whose products include natural antibiotics, antifungals, antivirals, immunosuppressants, and antineoplastics. We will also use our algorithms to redesign two restriction endonucleases (REs), and will apply K* to design peptide inhibitors for the CAL (Cystic fibrosis transmembrane conductance regulator Associated Ligand) PDZ domain. Our algorithms will predict NRPS and RE mutants with putative novel function; we will create the mutant proteins and test our predictions using biochemical activity assays and by determining new crystal structures. We will test the predicted CAL-binding peptides both in vitro and in vivo.

Project Narrative: Enzyme redesign provides a good test of our understanding of proteins. The long-term goal of our research is to develop novel algorithms to plan structure-based site-directed mutations to a protein's active site in order to modify its function. We will develop general planning software that can reprogram the specificity of many proteins, including NRPS domains, whose products include natural antibiotics, antifungals, antivirals, immunosuppressants, and antineoplastics. These engineered enzymes should enable combinatorial biosynthesis of novel pharmacologically active compounds, yielding new leads for drug design. The proposed application of our algorithms to restriction endonucleases and the CAL PDZ domain could lead to (respectively) biotechnology advances and novel therapeutic interventions for cystic fibrosis.
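The ensemble-based score described in the abstract can be illustrated with a minimal Python sketch. It assumes the commonly stated form of the score as a ratio of Boltzmann-weighted partition functions, q_PL / (q_P * q_L), computed over lists of rotamer-conformation energies for the bound complex, the free protein, and the free ligand. The function names, the energy units (kcal/mol), the temperature, and the simple stopping rule in `approx_partition_function` are illustrative assumptions, not the project's actual implementation or its pruning algorithm.

```python
import math
from typing import Sequence, Tuple

# Assumed constants: gas constant in kcal/(mol*K) and room temperature.
R_KCAL = 1.987e-3
TEMPERATURE_K = 298.15
RT = R_KCAL * TEMPERATURE_K


def partition_function(energies: Sequence[float], rt: float = RT) -> float:
    """Sum of Boltzmann weights exp(-E/RT) over a conformational ensemble."""
    return sum(math.exp(-e / rt) for e in energies)


def kstar_score(bound: Sequence[float],
                protein: Sequence[float],
                ligand: Sequence[float],
                rt: float = RT) -> float:
    """Ratio of the bound-state partition function to the product of the
    unbound protein and ligand partition functions (hypothetical helper)."""
    q_pl = partition_function(bound, rt)
    q_p = partition_function(protein, rt)
    q_l = partition_function(ligand, rt)
    return q_pl / (q_p * q_l)


def approx_partition_function(sorted_energies: Sequence[float],
                              epsilon: float,
                              rt: float = RT) -> Tuple[float, int]:
    """Deterministic epsilon-approximation, assuming the energies arrive in
    non-decreasing order and the ensemble size is known.

    Every conformation not yet visited has a weight no larger than the last
    weight w, so the unvisited sum is at most remaining * w. Stopping once
    that bound r satisfies r * (1 - epsilon) <= epsilon * q_partial
    guarantees q_partial >= (1 - epsilon) * q_exact.
    Returns (q_partial, number of conformations evaluated).
    """
    if not 0.0 < epsilon < 1.0:
        raise ValueError("epsilon must lie strictly between 0 and 1")
    n = len(sorted_energies)
    q_partial = 0.0
    for k, e in enumerate(sorted_energies):
        w = math.exp(-e / rt)
        q_partial += w
        remaining_bound = (n - (k + 1)) * w
        if remaining_bound * (1.0 - epsilon) <= epsilon * q_partial:
            return q_partial, k + 1
    return q_partial, n  # reached only for an empty ensemble
```

In this sketch, a candidate active-site sequence with a higher `kstar_score` would be predicted to bind more tightly; ranking sequences by such a score is the kind of mutation search the abstract describes. The early-stopping rule shows only why low-energy conformations dominate the sum and why most of the ensemble need not be evaluated.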

Agency
National Institutes of Health (NIH)
Institute
National Institute of General Medical Sciences (NIGMS)
Type
Research Project (R01)
Project #
5R01GM078031-04
Application #
8025987
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Hagan, Ann A
Project Start
2008-04-15
Project End
2014-01-31
Budget Start
2011-02-01
Budget End
2014-01-31
Support Year
4
Fiscal Year
2011
Total Cost
$316,780
Indirect Cost
Name
Duke University
Department
Biostatistics & Other Math Sci
Type
Other Domestic Higher Education
DUNS #
044387793
City
Durham
State
NC
Country
United States
Zip Code
27705
Qi, Yang; Martin, Jeffrey W; Barb, Adam W et al. (2018) Continuous Interdomain Orientation Distributions Reveal Components of Binding Thermodynamics. J Mol Biol 430:3412-3426
Ojewole, Adegoke A; Jou, Jonathan D; Fowler, Vance G et al. (2018) BBK* (Branch and Bound Over K*): A Provable and Efficient Ensemble-Based Protein Design Algorithm to Optimize Stability and Binding Affinity Over Large Sequence Spaces. J Comput Biol 25:726-739
Ojewole, Adegoke; Lowegard, Anna; Gainza, Pablo et al. (2017) OSPREY Predicts Resistance Mutations Using Positive and Negative Computational Protein Design. Methods Mol Biol 1529:291-306
Jain, Swati; Jou, Jonathan D; Georgiev, Ivelin S et al. (2017) A critical analysis of computational protein design with sparse residue interaction graphs. PLoS Comput Biol 13:e1005346
Hallen, Mark A; Jou, Jonathan D; Donald, Bruce R (2017) LUTE (Local Unpruned Tuple Expansion): Accurate Continuously Flexible Protein Design with General Energy Functions and Rigid Rotamer-Like Efficiency. J Comput Biol 24:536-546
Hallen, Mark A; Donald, Bruce R (2017) CATS (Coordinates of Atoms by Taylor Series): protein design with backbone flexibility in all locally feasible directions. Bioinformatics 33:i5-i12
Zhou, Yichao; Donald, Bruce R; Zeng, Jianyang (2017) Parallel Computational Protein Design. Methods Mol Biol 1529:265-277
Jou, Jonathan D; Jain, Swati; Georgiev, Ivelin S et al. (2016) BWM*: A Novel, Provable, Ensemble-based Dynamic Programming Algorithm for Sparse Approximations of Computational Protein Design. J Comput Biol 23:413-424
Gainza, Pablo; Nisonoff, Hunter M; Donald, Bruce R (2016) Algorithms for protein design. Curr Opin Struct Biol 39:16-26
Hallen, Mark A; Donald, Bruce R (2016) COMETS (Constrained Optimization of Multistate Energies by Tree Search): A Provable and Efficient Protein Design Algorithm to Optimize Binding Affinity and Specificity with Respect to Sequence. J Comput Biol 23:311-321

Showing the most recent 10 out of 35 publications