Most proteins carry out their functions through interactions with other proteins. Atomic-level quaternary structures of protein-protein complexes provide a physical picture of how these interactions take place in living cells and how new therapies can be designed to regulate interaction networks. Because experimental characterization of complex structures is difficult and expensive, computational modeling of protein-protein interactions has been a major theme in computational biology. Most efforts have focused on rigid-body docking, which builds complex conformations by combining known structures of the interacting components. Docking, however, is applicable only when the monomer structures are known, and its success rate is low when the components undergo conformational change upon binding. Alternatively, complex structures can be deduced from homologous structures using alignments generated by multi-chain threading. Although this approach does not require solved monomer structures, its modeling accuracy for distant-homology targets is unreliable, and the threading alignments generally contain gaps and errors. In this project, we seek to develop a new generation of computational approaches that significantly improve the coverage and accuracy of protein-protein complex structure modeling by integrating cutting-edge rigid-body docking with threading assembly simulations.
The specific aims include: (1) development of new interface-specific threading algorithms for distant-homology detection; (2) development of a new fragment assembly simulation method for full-length complex structure construction and refinement; (3) development of new strategies for ab initio docking; and (4) integration of the threading and docking methods for low-resolution docking and template-based docking structure refinement. The algorithms will be systematically trained on large-scale benchmark protein sets and tested in community-wide docking experiments, with a focus on modeling binding-induced conformational changes and predicting high-resolution complex structures for distantly homologous proteins. The methods and potentials developed in this project will be made freely available to the general community through Internet websites. The long-term goals of this project are (a) to develop advanced computational methods for accurate structure modeling of various protein-protein complexes, and (b) to use these methods for genome-wide structure modeling and structure-based function annotation of the protein-protein networks of various organisms.

Public Health Relevance

Protein-protein interactions are responsible for the development of pathological processes such as Alzheimer's disease and cancer. The atomic structures of protein-protein complexes are needed for designing synthetic compounds and biologics that disrupt or enhance these interactions. The goal of this project is to develop computational algorithms that build atomic structures of the complexes from amino acid sequences. Success in developing such methods could have an important impact on drug discovery and public health.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Special Emphasis Panel (ZRG1)
Program Officer
Lyster, Peter
University of Michigan Ann Arbor
Biostatistics & Other Math Sci
Schools of Medicine
Ann Arbor
United States
Zhang, Chengxin; Zheng, Wei; Freddolino, Peter L et al. (2018) MetaGO: Predicting Gene Ontology of Non-homologous Proteins Through Low-Resolution Protein Structure Prediction and Protein-Protein Network Mapping. J Mol Biol 430:2256-2265
Diamond, Justin S; Zhang, Yang (2018) THE-DB: a threading model database for comparative protein structure analysis of the E. coli K12 and human proteomes. Database (Oxford) 2018:
Wu, Jiansheng; Zhang, Qiuming; Wu, Weijian et al. (2018) WDL-RF: predicting bioactivities of ligand molecules acting with G protein-coupled receptors by combining weighted deep learning and random forest. Bioinformatics 34:2271-2282
Xia, Chun-Qiu; Han, Ke; Qi, Yong et al. (2018) A Self-Training Subspace Clustering Algorithm under Low-Rank Representation for Cancer Classification on Gene Expression Data. IEEE/ACM Trans Comput Biol Bioinform 15:1315-1324
Virtanen, Jouko J; Zhang, Yang (2018) MR-REX: molecular replacement by cooperative conformational search and occupancy optimization on low-accuracy protein models. Acta Crystallogr D Struct Biol 74:606-620
Hu, Jun; Li, Yang; Zhang, Yang et al. (2018) ATPbind: Accurate Protein-ATP Binding Site Prediction by Combining Sequence-Profiling and Structure-Based Comparisons. J Chem Inf Model 58:501-510
Dong, Runze; Peng, Zhenling; Zhang, Yang et al. (2018) mTM-align: an algorithm for fast and accurate multiple protein structure alignment. Bioinformatics 34:1719-1725
Hu, Jun; Liu, Zi; Yu, Dong-Jun et al. (2018) LS-align: an atom-level, flexible ligand structural alignment algorithm for high-throughput virtual screening. Bioinformatics 34:2209-2218
Vreven, Thom; Schweppe, Devin K; Chavez, Juan D et al. (2018) Integrating Cross-Linking Experiments with Ab Initio Protein-Protein Docking. J Mol Biol 430:1814-1828
Keasar, Chen; McGuffin, Liam J; Wallner, Björn et al. (2018) An analysis and evaluation of the WeFold collaborative for protein structure prediction and its pipelines in CASP11 and CASP12. Sci Rep 8:9939
