Proteins facilitate nearly every useful function in biology, and there is enormous potential in designing proteins to perform completely new functions. These new functions might be useful in industry, for example by enabling the manufacture of a new chemical, or in health care or the environment. However, many functions are too complex to design with current predictive methods. This project seeks to address this major bottleneck by advancing computational models to include multiple functional states of the protein. The project will then use these multi-state models to provide fundamental insights into the design principles of how proteins function, both at the molecular level and in the complex environment of living cells. More broadly, the resulting methods and knowledge will advance academic and industrial design of useful new proteins, such as biocatalysts for cost-effective and environmentally responsible manufacturing. The computational methods developed will be integrated into the protein design software suite Rosetta, which is freely available as source code to a large community of academic users and is licensed by several biotechnology and pharmaceutical companies. New design methods will be integrated into graduate-level courses and incorporated into research and outreach activities focused on students from socioeconomically disadvantaged backgrounds, underrepresented groups, and women in computational sciences.

Progress in the ability to design proteins with functions not existing in nature has been limited by an incomplete understanding of the key constraints on protein function that need to be captured by a predictive model. To advance modeling and design of proteins, the objectives of this project are to determine molecular and cellular constraints on protein function, and to incorporate this knowledge into improved computational methods to design proteins under multiple functional constraints. This project will first establish an experimental system that allows for a comprehensive examination of the effect of sequence changes on function and of the ability to predict these effects computationally. Experiments will interrogate the metabolic enzyme dihydrofolate reductase using deep mutational scanning and in vivo assays that will determine how the fitness effects of thousands of variants change in different backgrounds that perturb molecular function and cellular context. Second, the project seeks to advance computational protein design methods to model protein sequences under multiple constraints. Tests of the model by comparison to experiments will guide critically needed improvements and will identify cellular constraints not captured by the multi-state model. The resulting knowledge will lead to improved computational methods that model and design proteins under multiple functional constraints and that can be applied to important challenges in engineering new and useful protein functions.

This award is supported by the Systems and Synthetic Biology Program and the Molecular Biophysics Program in the Division of Molecular and Cellular Biosciences, and the Chemistry of Life Processes Program in the Division of Chemistry.

Agency: National Science Foundation (NSF)
Institute: Division of Molecular and Cellular Biosciences (MCB)
Application #: 1615990
Program Officer: David Rockcliffe
Project Start:
Project End:
Budget Start: 2016-08-01
Budget End: 2021-07-31
Support Year:
Fiscal Year: 2016
Total Cost: $995,930
Indirect Cost:
Name: University of California San Francisco
Department:
Type:
DUNS #:
City: San Francisco
State: CA
Country: United States
Zip Code: 94103