Accurately modeling the structure of proteins and protein complexes is considered a grand challenge with broad economic and scientific impact. One of the key obstacles is the absence of a reliable sampling method that can efficiently explore the tremendously large protein conformation space. This CAREER project investigates efficient sampling approaches that can lead to the prediction of high-resolution protein structures with an accuracy and reliability not currently achievable in computational protein modeling. The rationale is to integrate various physics- and knowledge-based scoring functions via multi-scoring-function sampling to explore the complex protein conformation space. The research work includes 1) establishing computational models for multi-scoring-function sampling in protein structure modeling with theoretically and mathematically rigorous justification; 2) designing novel sampling algorithms to efficiently explore the large protein conformation space; 3) applying the sampling algorithms to important protein modeling applications, including ab initio protein folding and protein-protein docking; and 4) developing a resource-efficient programming paradigm for protein modeling.
The efficient sampling approaches developed in this research can be applied to a variety of important computational biology applications, providing a critical stepping stone toward reliably improving the resolution of protein models for practical use. Success in high-resolution protein modeling will have a significant impact on genomic studies, disease research, bio-energy development, and the drug-design industry. Beyond its research impact, this CAREER project has an educational goal: to attract excellent students, particularly minority students, to participate in computational biology research and pursue careers in computational science.