This subproject is one of many research subprojects utilizing the resources provided by a Center grant funded by NIH/NCRR. Primary support for the subproject and the subproject's principal investigator may have been provided by other sources, including other NIH sources. The Total Cost listed for the subproject likely represents the estimated amount of Center infrastructure utilized by the subproject, not direct funding provided by the NCRR grant to the subproject or subproject staff.

Predicting the tolerated mutational space of proteins has important applications for modeling fundamental properties of proteins and their evolution; it also drives progress in protein design. Here we develop a computational model to predict the tolerated sequence space reachable by single mutations. We assess the model by comparison to the observed variability in more than 50,000 HIV-1 protease sequences, one of the most comprehensive datasets available. The model integrates multiple structural and functional constraints acting on the protease and captures the emergence of resistance mutations. A key feature of the model is the simulation of sequences using ensembles of protein conformations, and we show comparable prediction accuracy when using crystallographically determined or computationally generated structural ensembles. As proteins often use intrinsic sequence plasticity to evolve new properties, accurate predictions of tolerated sequence space afforded by this model should enable computational design and directed evolution of protein functions.