The specific interactions of proteins with their ligands give rise to biological functions that are essential for living organisms. These molecular recognition events occur in local surface regions of proteins. With the rapid increase in the number of high-resolution protein structures and impressive advances in protein structure prediction, complete three-dimensional structural information for most organismal proteins is expected to be available soon. Our previous study indicates that similar binding sites occur in non-homologous protein structures, making it feasible to predict ligand binding sites and ligand structures by comparing a protein's binding sites with those of protein-ligand complex structures in the Protein Data Bank. Building on these observations, our goal is to develop a high-performance computational toolset for structure-based protein-ligand interaction studies and drug discovery at the proteomic level by utilizing the local structural patterns of protein-ligand interactions from big biomolecular structure data and by detecting conserved local regions between protein structures.
In AIM 1, we will develop G-PLI-Predictor to predict ligand binding sites, putative ligand structures, and protein functions for hard targets and to design new ligands using a chemical fragment template-based approach.
In AIM 2, we will develop two tools to improve performance in structure library search: G-LoSALR, a coarse-grained version of our local structure alignment tool G-LoSA, and G-LBS-Refiner, a molecular dynamics simulation-based conformation sampling method guided by restraint potentials derived from structure templates. G-LoSALR will tolerate conformational variations in protein structures and structural errors in predicted protein models during structure alignment and similarity measurement. G-LBS-Refiner will provide more reliable binding site conformations by generating holo-conformations from an apo-structure or by refining low-resolution protein models.
In AIM 3, we will develop G-Promis, a proteomic-scale ligand promiscuity prediction method. G-Promis will compare a query binding site structure against the whole surface of each protein in the proteome structure library to identify a set of potential protein targets and then estimate approximate binding affinities between a query ligand and the target proteins. These web services and/or standalone toolkits will be freely available to all academic users and not-for-profit institutions. The proposed research will provide reliable and general computational methods to students and researchers in the biology community and other disciplines, fostering synergistic scientific research and education on protein-ligand interactions and facilitating drug development.
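
The two-stage screen described for AIM 3 can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not the G-Promis implementation: binding sites and surface patches are reduced to plain feature vectors, cosine similarity stands in for a local structure alignment score, and `screen_proteome`, `library`, `affinity`, and the cutoff of 0.9 are hypothetical names and values.

```python
from math import sqrt
from typing import Callable, Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Toy stand-in for a local structure alignment score (not G-LoSA)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def screen_proteome(
    query_site: Vector,
    library: Dict[str, List[Vector]],
    affinity: Callable[[str], float],
    cutoff: float = 0.9,
) -> List[Tuple[str, float, float]]:
    """Return (protein_id, best_similarity, affinity) for candidate targets."""
    hits = []
    for protein_id, patches in library.items():
        # Stage 1: compare the query site against every surface patch
        # of this protein and keep the best similarity.
        best = max((cosine_similarity(query_site, p) for p in patches), default=0.0)
        if best >= cutoff:
            # Stage 2: estimate an approximate binding affinity
            # only for proteins that passed the similarity filter.
            hits.append((protein_id, best, affinity(protein_id)))
    # Rank candidates by affinity, lowest (most favorable) value first.
    hits.sort(key=lambda h: h[2])
    return hits


# Hypothetical toy data: three proteins, each with a few surface patches.
library = {
    "protA": [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0]],
    "protB": [[0.0, 1.0, 0.0]],
    "protC": [[1.0, 0.1, 0.0]],
}
toy_affinity = {"protA": -7.5, "protB": -9.0, "protC": -8.2}
hits = screen_proteome([1.0, 0.0, 0.0], library, lambda pid: toy_affinity[pid])
```

In this toy run, `protB` is discarded by the similarity filter even though its placeholder affinity is the most favorable, which is the point of filtering by binding site similarity before estimating affinities.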

Public Health Relevance

(RELEVANCE) Proteins not only interact with diverse ligands at local surface regions, but also carry out their biological processes by coordinating through complex networks of transient interactions rather than acting in isolation. This proposal aims to develop a high-performance computational toolset for structure-based protein-ligand interaction studies and drug discovery at the proteomic level by harnessing big biomolecular structure data. The resulting tools will be freely available to other researchers and can ultimately be applied to better understand the molecular mechanisms of pathophysiological pathways and to facilitate drug development for the prevention and treatment of human diseases.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Macromolecular Structure and Function D Study Section (MSFD)
Program Officer
Preusch, Peter
Lehigh University
Schools of Arts and Sciences
United States