The application of computer-aided drug design (CADD) to drug discovery has yet to reach its full potential, despite the broad use of these methods across academia and the pharmaceutical industry and tremendous progress in CPU speeds over the last 35 years. Although the existing computational methods are useful, there are serious limitations in the ability to predict small molecule ligand-protein target interactions. If these obstacles can be overcome, accurate prediction of these interactions would shorten small molecule drug discovery timelines and reduce potentially toxic off-target effects, thereby lowering overall development costs and increasing the safety of new medications. The computer-aided drug design community is working to develop improved methods and generally agrees that further progress requires greater public availability of high-quality, compelling, and "problem"-specific protein-ligand datasets for challenging, improving, and validating computational algorithms. The NIH seeks to solve this problem through the issuance of RFA GM-08-008, "Drug Docking and Screening Data Resource". We submit a proposal to establish a publicly available Drug Design Data Resource (D3R) to meet the goals of this RFA. We propose three innovative, CADD community-oriented Aims. First, we will engage our pharmaceutical partners to identify, curate, and enhance 6-10 protein-ligand datasets per year. This work will build on the existing CSAR project, both by rapidly incorporating its datasets and by perpetuating its academic-industry relationships. An innovative extension beyond CSAR will be longer tenures at pharmaceutical companies, allowing further exchange of ideas and testing of data with various workflows that can then be made publicly available.
We will engage contract research organizations for compound synthesis and in vitro biochemical assays, and our academic partners for novel thermodynamic data. Second, we will use these new datasets as a basis for engaging the CADD community in quarterly blind prediction exercises, focusing on binding mode and affinity predictions for ligand-protein interactions. This blind challenge approach has proven highly fruitful in other fields, e.g., protein folding. Workshops will be held to share results and discuss their implications. Third, we will develop a database and web presence to archive, share, and integrate these data, as well as workflows submitted by users to enable replication and dissemination of their methods. This community-oriented effort will establish a powerful new platform for advancing the state of the art in computer-aided drug design.

Public Health Relevance

Computers are used to help design new drugs, and there is great promise for computational methods to become even more precise and effective than they are today. However, creating improved methods will require giving computational scientists access to many more experimental measurements of the properties of drugs and drug-like molecules than are now available, so that advanced methods can be tested and optimized. The present project will meet this need, and thus ultimately speed the discovery of new medications, by collecting the required data from research laboratories in the pharmaceutical industry and universities and organizing them in an electronic database so they can be easily accessed and used.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project--Cooperative Agreements (U01)
Study Section
Special Emphasis Panel (ZGM1)
Program Officer
Wehrle, Janna P
University of California San Diego
Schools of Arts and Sciences
La Jolla
United States
Gaieb, Zied; Liu, Shuai; Gathiaka, Symon et al. (2018) D3R Grand Challenge 2: blind prediction of protein-ligand poses, affinity rankings, and relative binding free energies. J Comput Aided Mol Des 32:1-20
Yin, Jian; Henriksen, Niel M; Slochower, David R et al. (2017) The SAMPL5 host-guest challenge: computing binding free energies and enthalpies from explicit solvent simulations by the attach-pull-release (APR) method. J Comput Aided Mol Des 31:133-145
Yin, Jian; Henriksen, Niel M; Slochower, David R et al. (2017) Overview of the SAMPL5 host-guest challenge: Are we doing better? J Comput Aided Mol Des 31:1-19
Shirts, Michael R; Klein, Christoph; Swails, Jason M et al. (2017) Lessons learned from comparing molecular dynamics engines on the SAMPL5 dataset. J Comput Aided Mol Des 31:147-161
Bannan, Caitlin C; Burley, Kalistyn H; Chiu, Michael et al. (2016) Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge. J Comput Aided Mol Des 30:927-944
Adams, Paul D; Aertgeerts, Kathleen; Bauer, Cary et al. (2016) Outcome of the First wwPDB/CCDC/D3R Ligand Validation Workshop. Structure 24:502-508
Gathiaka, Symon; Liu, Shuai; Chiu, Michael et al. (2016) D3R grand challenge 2015: Evaluation of protein-ligand pose and affinity predictions. J Comput Aided Mol Des 30:651-668
Friedrich, Joachim (2015) Efficient Calculation of Accurate Reaction Energies-Assessment of Different Models in Electronic Structure Theory. J Chem Theory Comput 11:3596-3609