This proposal is entitled "Binding-Site Modeling with Multiple-Instance Machine-Learning." A number of interrelated computational methods for making predictions about the biological behavior of small molecules have been under development within the Jain Laboratory for over twenty years. These share a common strategy that considers molecular interactions at their surface interface, where proteins and ligands actually interact. These methods yield measurements of similarity between small molecules or between protein binding pockets. They also yield measurements of the complementarity of a small molecule to a protein binding site (the molecular docking problem). A generalization of these concepts makes possible the construction of a virtual binding site for quantitative activity prediction purely from data about the biological activities of a set of small molecules. The goals of the proposed work include further improving the accuracy and breadth of applicability of the binding site modeling approach. The primary applications of the approach are to guide optimization of leads within medicinal chemistry projects and to quantify potential off-target effects during pre-clinical drug discovery. A critical focus of the work will be data and software dissemination, in order to accelerate the efficient development of targeted therapies. In addition to methods development, the proposed work will involve broad application of these state-of-the-art predictive modeling methods. The proposed work will proceed with the collaborative input of our pharmaceutical industry colleagues, who have specialized knowledge and data sets that are vital for cutting-edge work in computer-aided drug design.
The expected results include more efficient lead optimization (fewer compounds to reach desired biological parameters), truly effective scaffold replacement (to move away from a molecular series with biological limitations), and improved computational predictions of off-target effects during pre-clinical drug design.

Public Health Relevance

This project seeks to refine an integrated platform for physically realistic prediction of ligand binding affinities using multiple methods that span small molecule molecular similarity, molecular docking, and protein binding site similarity. These tools will provide predictive modeling unrestrained by scaffold congruence between what is known and what is to be predicted. Applications of the effort include prediction of bioactive molecular poses and activities to guide lead optimization and to quantify off-target liability effects. Data and software will be made widely available to academic and industrial research groups.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section
Biodata Management and Analysis Study Section (BDMA)
Program Officer
Lyster, Peter
University of California San Francisco
Schools of Pharmacy
San Francisco
United States
Cleves, Ann E; Jain, Ajay N (2018) Quantitative surface field analysis: learning causal models to predict ligand binding affinity and pose. J Comput Aided Mol Des :
Cleves, Ann E; Jain, Ajay N (2017) ForceGen 3D structure and conformer generation: from small lead-like molecules to macrocyclic drugs. J Comput Aided Mol Des 31:419-439
Cleves, Ann E; Jain, Ajay N (2016) Extrapolative prediction using physically-based QSAR. J Comput Aided Mol Des 30:127-52
Cleves, Ann E; Jain, Ajay N (2015) Chemical and protein structural basis for biological crosstalk between PPARα and COX enzymes. J Comput Aided Mol Des 29:101-12
Cleves, Ann E; Jain, Ajay N (2015) Knowledge-guided docking: accurate prospective prediction of bound configurations of novel ligands using Surflex-Dock. J Comput Aided Mol Des 29:485-509
Yera, Emmanuel R; Cleves, Ann E; Jain, Ajay N (2014) Prediction of off-target drug effects through data fusion. Pac Symp Biocomput :160-71
Spitzer, Russell; Cleves, Ann E; Varela, Rocco et al. (2014) Protein function annotation by local binding site surface similarity. Proteins 82:679-94
Varela, Rocco; Cleves, Ann E; Spitzer, Russell et al. (2013) A structure-guided approach for protein pocket modeling and affinity prediction. J Comput Aided Mol Des 27:917-34
Varela, Rocco; Walters, W Patrick; Goldman, Brian B et al. (2012) Iterative refinement of a binding pocket model: active computational steering of lead optimization. J Med Chem 55:8926-42
Jain, Ajay N; Cleves, Ann E (2012) Does your model weigh the same as a duck? J Comput Aided Mol Des 26:57-67
