Predicting the three-dimensional structure of a protein solely from its primary sequence remains a grand and elusive challenge in modern computational biology. Molecular dynamics simulation holds great promise for predicting protein structures and folding pathways in molecular detail. Recent advances in computer hardware and enhanced sampling methods have made it possible to fold larger proteins ab initio. The hardware highlight is Anton, a massively parallel special-purpose supercomputer designed by D.E. Shaw Research. Anton successfully folded the D14A fast-folding mutant of the 80-residue λ-repressor, which was achieved at 49 microseconds (μs) in 643-μs-long simulations. On the other hand, the latest advance in enhanced sampling methods is represented by the single-copy continuous simulated tempering (CST) method developed by the PI's group. The group of Dr. Klaus Schulten incorporated the CST method into the NAMD package, which repeatedly folded the 80-residue λ-repressor HG mutant from a fully extended conformation to the native state at 0.5 and 4 μs in 10-μs-long simulations, with Cα root-mean-square deviations (Cα-RMSD) of 1.7 Å, on a conventional computing platform. In marked contrast, complete folding of the same protein was not observed using Anton at multiple temperatures, even in 100-μs-long simulations. This performance of CST in folding simulations has never been matched by any other sampling method for similar purposes on conventional computing platforms. Most recently, to further enhance sampling efficiency for larger systems, the PI has developed a more powerful parallel CST (PCST) method. An initial ab initio folding simulation of trp-cage clearly demonstrated that PCST was markedly more efficient than CST at facilitating multiple folding and unfolding events. The PCST method serves as a solid foundation for the proposed research, which has three Specific Aims: (1) development of the PCST method for enhanced sampling; (2) design of advanced temperature-dependent restraint schemes for targeted sampling; and (3) development of advanced blind model selection methods for efficient target selection. Our in-depth preliminary studies demonstrate that these new methods clearly outperform all existing methods and suggest a high likelihood of success for the proposed research. Ultimately, these powerful new algorithms will provide urgently needed tools for protein simulations and offer an effective solution for structural refinement in experimental X-ray crystallography and electron cryo-microscopy.

Public Health Relevance

Predicting the three-dimensional structure of a protein solely from its primary sequence remains a grand and elusive challenge in modern computational biology. The proposed study aims to develop a set of computational tools to bring us closer to this goal. The implementation and release of these computational methods to the entire scientific community will expedite the pursuit of high-accuracy structures of biomedically important proteins, thus directly benefiting disease prevention and treatment.

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 5R01GM127628-02
Application #: 9695997
Study Section: Macromolecular Structure and Function D Study Section (MSFD)
Program Officer: Lyster, Peter
Project Start: 2018-06-01
Project End: 2022-02-28
Budget Start: 2019-03-01
Budget End: 2020-02-29
Support Year: 2
Fiscal Year: 2019
Name: Baylor College of Medicine
Department: Biochemistry
Type: Schools of Medicine
DUNS #: 051113330
City: Houston
State: TX
Country: United States
Zip Code: 77030
Du, Junqing; Kirk, Brian; Zeng, Jia et al. (2018) Three classes of response elements for human PRC2 and MLL1/2-Trithorax complexes. Nucleic Acids Res 46:8848-8864
Lin, Xingcheng; Noel, Jeffrey K; Wang, Qinghua et al. (2018) Atomistic simulations indicate the functional loop-to-coiled-coil transition in influenza hemagglutinin is not downhill. Proc Natl Acad Sci U S A 115:E7905-E7913