The proposed work will accelerate the pace of drug discovery by developing, validating, and testing new methods, tools, and resources for structure-based drug design. Two fundamental challenges of structure-based drug design are the accurate scoring and ranking of protein-ligand structures, which identifies active compounds, and the ability to efficiently search a large number of ligands, which ensures that active compounds are sampled. This proposal will address these challenges by developing a novel approach for protein-ligand scoring and expanding the size of the chemical space that can be efficiently searched during lead optimization. The methods will be validated by their prospective application toward the discovery of new anti-cancer molecules and will be made readily accessible through online resources and open-source tools.

The proposal leverages recent and significant advances in deep learning and image recognition to develop scoring functions that accurately recognize high-affinity protein-ligand interactions. This is achieved by designing and training convolutional neural nets on three-dimensional representations of protein-ligand structures to discriminate between binders and non-binders. Convolutional neural net training will exploit large datasets of affinity and structural data to automatically extract the relevant features necessary to accurately prioritize compounds. Additionally, the proposal develops the first means of fully integrating a convolutional neural net scoring function directly into an energy minimization and docking workflow.

Interactive virtual screening enables the search of millions of compounds in a few seconds so that queries can be interactively optimized. Interactivity enables the synergistic unification of human expert knowledge and efficient computational algorithms. The proposed work will dramatically expand the size of chemical space accessible through interactive virtual screening. Algorithms for efficiently searching the chemical space of billions or trillions of compounds implicitly defined by a set of reaction schemas and fragments will be created as part of a lead optimization workflow. Fragment-oriented search will be accelerated by a new data structure that combines pharmacophore and molecular shape information into a single sub-linear time index.

The scoring and lead optimization methods developed in this proposal will be released as open-source software and made immediately available through open-access online resources. As part of the prospective validation of the proposed methods, these resources will be used to identify hit compounds and optimize leads for two targets related to cancer metabolism: serine hydroxymethyltransferase and kidney glutaminase isoform C. Successful completion of the objectives of this proposal will positively impact public health by reducing the cost and time-to-market of developing new drugs, particularly with respect to novel protein targets.
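
To make concrete what it means for a chemical space to be implicitly defined by reaction schemas and fragments, the following is a minimal illustrative sketch, not the project's implementation. The schema, slot names, fragment labels, and all function names are hypothetical. It assumes a single reaction schema with three fragment slots and a filter that can be evaluated one fragment at a time, an assumption the abstract does not state. Under that assumption, the number of products is the product of the slot sizes, while a fragment-wise search only has to examine the sum of the slot sizes.

```python
from itertools import product
from math import prod

# Hypothetical reaction schema: each slot lists the fragments that can fill it.
# The labels are placeholders, not real reagents from the proposal.
SCHEMA = {
    "amine": ["amine_1", "amine_2", "amine_3"],
    "acid": ["acid_1", "acid_2"],
    "aldehyde": ["aldehyde_1", "aldehyde_2", "aldehyde_3", "aldehyde_4"],
}


def implicit_size(slots):
    """Number of products the schema defines, without enumerating any of them."""
    return prod(len(fragments) for fragments in slots.values())


def enumerate_products(slots):
    """Explicitly enumerate every product (only feasible for tiny spaces)."""
    names = list(slots)
    for combo in product(*(slots[name] for name in names)):
        yield dict(zip(names, combo))


def fragment_wise_search(slots, accept):
    """Filter each slot independently with accept(slot_name, fragment).

    Returns the surviving fragments per slot and the number of fragment
    checks performed, which grows with the sum of the slot sizes rather
    than with their product.
    """
    survivors = {
        name: [frag for frag in fragments if accept(name, frag)]
        for name, fragments in slots.items()
    }
    checks = sum(len(fragments) for fragments in slots.values())
    return survivors, checks
```

With this toy schema, `implicit_size(SCHEMA)` is 3 × 2 × 4 = 24 products, while `fragment_wise_search` inspects only 3 + 2 + 4 = 9 fragments. The same gap between product and sum is what lets a search over a few thousand fragments stand in for billions of implied compounds.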

Public Health Relevance

Researchers will be able to more quickly and accurately identify potential drug candidates using the new methods, tools, and resources for structure-based drug discovery created by this project. These tools will be readily accessible through open-access web sites and open-source software. As part of the project these tools will be used to identify new molecules that are relevant to the treatment of cancer.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Macromolecular Structure and Function D Study Section (MSFD)
Program Officer: Preusch, Peter
University of Pittsburgh
Schools of Medicine
United States
Hochuli, Joshua; Helbling, Alec; Skaist, Tamar et al. (2018) Visualizing convolutional neural network protein-ligand scoring. J Mol Graph Model 84:96-108
Sunseri, Jocelyn; King, Jonathan E; Francoeur, Paul G et al. (2018) Convolutional neural network scoring and minimization in the D3R 2017 community challenge. J Comput Aided Mol Des :
Koes, David R; Dömling, Alexander; Camacho, Carlos J (2018) AnchorQuery: Rapid online virtual screening for small-molecule protein-protein interaction inhibitors. Protein Sci 27:229-232
Gau, David; Lewis, Taber; McDermott, Lee et al. (2018) Structure-based virtual screening identifies a small-molecule inhibitor of the profilin 1-actin interaction. J Biol Chem 293:2606-2616
Koes, David R; Vries, John K (2017) Evaluating amber force fields using computed NMR chemical shifts. Proteins 85:1944-1956
Ragoza, Matthew; Hochuli, Joshua; Idrobo, Elisa et al. (2017) Protein-Ligand Scoring with Convolutional Neural Networks. J Chem Inf Model 57:942-957
Koes, David R; Vries, John K (2017) Error assessment in molecular dynamics trajectories using computed NMR chemical shifts. Comput Theor Chem 1099:152-166
Pirhadi, Somayeh; Sunseri, Jocelyn; Koes, David Ryan (2016) Open source molecular modeling. J Mol Graph Model 69:127-143
Sunseri, Jocelyn; Ragoza, Matthew; Collins, Jasmine et al. (2016) A D3R prospective evaluation of machine learning for protein-ligand scoring. J Comput Aided Mol Des 30:761-771
Hain, Ethan; Camacho, Carlos J; Koes, David Ryan (2016) Fragment oriented molecular shapes. J Mol Graph Model 66:143-154

Showing the most recent 10 out of 14 publications