The application of computer-aided drug design (CADD) to drug discovery has yet to reach its full potential, despite the broad use of these methods across academia and the pharmaceutical industry and tremendous progress in CPU speeds over the last 35 years. Although the existing computational methods are useful, there are serious limitations in the ability to predict small molecule ligand-protein target interactions. It is recognized that, if these obstacles can be overcome, the ability to predict these interactions accurately would have a dramatic and positive outcome by shortening small molecule drug discovery timelines and reducing potentially toxic off-target effects, thereby lowering overall development costs and increasing the safety of new medications. The computer-aided drug design community is working to develop improved methods and generally agrees that further progress requires greater public availability of high-quality, compelling, and problem-specific protein-ligand datasets for challenging, improving, and validating computational algorithms. The NIH seeks to solve this problem through the issuance of RFA GM-08-008, "Drug Docking and Screening Data Resource". We submit a proposal to establish a publicly available Drug Design Data Resource (D3R) to meet the goals of this RFA. We propose three innovative, CADD community-oriented Aims. First, we will engage our pharmaceutical partners to identify, curate, and enhance 6-10 protein-ligand datasets per year. This work will build on the existing CSAR project, both by rapidly incorporating its datasets and by perpetuating its academic-industry relationships. An innovative extension beyond CSAR will be longer tenures at pharmaceutical companies to allow further exchange of ideas and testing of data with various workflows, which can then be made publicly available.
We will engage contract research organizations for compound synthesis and in vitro biochemical assays and our academic partners for novel thermodynamic data. Second, we will use these new datasets as a basis for engaging the CADD community in quarterly blind prediction exercises, focusing on binding mode and affinity predictions for ligand-protein interactions. This blind challenge approach has proven highly fruitful in other fields, e.g., protein folding. Workshops will be held to share results and discuss their implications. Third, we will develop a database and web presence to archive, share, and integrate these data, as well as workflows submitted by users to enable replication and dissemination of their methods. This community-oriented effort will establish a powerful new platform for advancing the state of the art in computer-aided drug design.
Computers are used to help design new drugs, and there is great promise for computational methods to become even more precise and effective than they are today. However, creating improved methods will require giving computational scientists access to many more experimental measurements of the properties of drugs and drug-like molecules than are now available, so that advanced methods can be tested and optimized. The present project will meet this need, and thus ultimately speed the discovery of new medications, by collecting the required data from research laboratories in the pharmaceutical industry and universities and organizing them in an electronic database so they can be easily accessed and used.