The study of protein/ligand binding is one of the central problems in computational biology because of its importance in understanding intermolecular interactions and because of its practical payoff in drug discovery efforts. The transformative impact that accurate target/ligand structures can have on the design of next-generation medicines cannot be overstated. If we could routinely and accurately design molecules using these approaches, it would revolutionize drug discovery by winnowing out compounds with no activity while focusing more effort and scrutiny on highly active compounds. Determining the structure of a small molecule (drug candidate or lead compound) bound to a biological receptor (protein implicated in disease) is a necessary step in this approach to drug discovery. X-ray techniques provide astounding insights into the structure of protein-ligand complexes, but they can be limited by the resolution to which a crystal diffracts, and the refinement process can be hampered by the lack of good potentials for novel small-molecule compounds. We have extended our linear-scaling semiempirical quantum mechanical (QM) X-ray refinement approach and applied it to this field with great success. This approach has proven itself to be robust enough for routine QM-based X-ray refinement, and it is currently being successfully marketed. However, since refinement methods are ultimately built on optimization algorithms and do not include sampling, they all suffer from what is termed a "limited radius of convergence." Therefore, crystallographic workflows (automatic and manual) include ligand placement as part of the model-building process. Conventional automatic procedures for ligand placement are resolution dependent and are unable to take into account the chemistry of the active site. Further, the ligand conformation is often so highly strained that X-ray refinement alone is unable to deduce the proper structure. When this happens, significant intervention on the part of the crystallographer is required, which increases expense and decreases productivity.
In this proposal we describe a novel method we call Movable Type (MT), which addresses the protein-ligand binding and scoring problem using fundamental statistical mechanics combined with a novel way to generate the ensemble of a ligand in a protein binding pocket. Via a rapid assembly of the necessary partition functions, we directly obtain binding free energies and the low free energy poses. Conceptually, the MT method is analogous to block and typeset printing, which allows us to efficiently evaluate partition functions describing regions or systems of interest. In this approach we construct two databases: 1) the probabilities of specific pairwise interactions as a function of the pair distance r, obtained from a knowledge base (the Protein Data Bank (PDB) or the Cambridge Structural Database (CSD)), and 2) the energetics of those pairwise interactions as a function of r, obtained from empirical potentials, which can either be derived from the probabilities or use extant pairwise potentials such as AMBER. Overall, the MT method is a general one: it can use a broad range of two-body potential functions and can be extended to higher-order interactions if so desired. In this project we will extend the MT method and deliver this methodology to X-ray crystallographers and computational chemists for use in automated ligand placement within the experimental density during X-ray refinement. This work will involve development of a new, automated tool to find the active-site ligand density and place the ligand within that density using the MT method. We will commercially deploy the technology and construct graphical user interfaces for use in MOE, Phenix, and our web-based cloud platform. Finally, this software will be used in real-life structure-based drug discovery problems with our pharmaceutical collaborators (see Letters of Support).
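The statistical-mechanical idea behind the two databases can be illustrated with a minimal Python sketch. It is not the MT implementation. It assumes a standard inverse-Boltzmann relation to turn knowledge-base probabilities into pairwise energies, a simple distance-binned lookup, and a plain Boltzmann sum over an explicit list of poses. All function names, the bin width, the reference distribution, and the value of kT are hypothetical choices made for illustration.

```python
import math

KT = 0.593  # assumed thermal energy in kcal/mol (roughly 298 K)


def inverse_boltzmann(p_observed, p_reference, kt=KT):
    """Convert binned pair probabilities into energies, E(r) = -kT ln(p_obs / p_ref).

    Both inputs are lists of strictly positive probabilities over the same r bins.
    The reference distribution is an assumption of this sketch.
    """
    return [-kt * math.log(po / pr) for po, pr in zip(p_observed, p_reference)]


def pose_energy(pair_distances, energy_table, bin_width):
    """Sum tabulated pairwise energies over the atom-pair distances of one pose."""
    total = 0.0
    for r in pair_distances:
        # Distances beyond the table fall into the last bin.
        index = min(int(r / bin_width), len(energy_table) - 1)
        total += energy_table[index]
    return total


def free_energy(pose_energies, kt=KT):
    """Free energy of an ensemble of poses, F = -kT ln(sum_i exp(-E_i / kT)).

    The minimum energy is factored out to keep the exponentials well behaved.
    """
    e_min = min(pose_energies)
    z = sum(math.exp(-(e - e_min) / kt) for e in pose_energies)
    return e_min - kt * math.log(z)


def lowest_energy_pose(pose_energies):
    """Index of the pose with the lowest energy in the ensemble."""
    return min(range(len(pose_energies)), key=lambda i: pose_energies[i])
```

In this sketch a binding free energy would be estimated as the difference between the free energy of the bound ensemble and those of the unbound protein and ligand ensembles. How MT actually generates and combines those ensembles is not specified here.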
We will enhance and deploy the Movable Type (MT) method to address the prediction of small-molecule binding to drug targets. The resultant methodology and computer program will improve our ability to rapidly and routinely identify biologically active molecules that could be useful in the treatment of a number of disease states.