Two overarching goals in chemical and structural biology are finding ligands for every protein (Schreiber, S. L., Nat Chem Biol 2005) and identifying the functions and associations among proteins based on their structure. For the last decade, these goals have been pursued empirically (Gerlt, J. A.; et al., Biochemistry 2011). I will argue that there is a need for computation in both ligand discovery and functional association of proteins. I will also argue that such goals, correctly framed, are now becoming attainable by computational methods, and that at scale they are practical only computationally. In the last five years, structure-based docking screens have seen two important advances. First, the technique has predicted new chemotypes for over 60 targets. Although docking retains key liabilities and cannot rank-order ligands, it can now, with some reliability, prioritize likely molecules as candidates. Second, the technique has become fast: I will show that the method can screen the structurally accessible and interesting proteome, about 4000 targets, in four months. A separate development is the advent of chemoinformatics techniques that compare targets by the similarity of their ligands, rather than by structural or sequence similarity. It turns out that ligand similarity reveals associations that are curiously opaque to sequence and structure, and this has been used to make surprising and high-profile predictions of drug off-targets (Keiser, M. J.; et al., Relating protein pharmacology by ligand chemistry, Nat Biotech 2007, 25(2), 197-206), drug mechanisms of action (Gregori-Puigjane, E.; et al., Proc Natl Acad Sci U S A 2012; Keiser, M. J.; et al., Nature 2009), and drug toxicities (Lounkine, E.; et al., Nature 2012). Very recently the technique has been extended to reorganize the GPCR dendrogram (Lin, H.; et al., Nature Methods 2013, accepted), and I will take that approach to the whole-proteome level by using the similarity among docking hit lists to associate proteins.
The docking hit lists and the associations among them will allow three questions to be addressed: "What compounds are available that might modulate my target?", "What other targets might my target be associated with?" and, when appropriate, "What is the function of my target?" All results will be made accessible online.
My first goal is to predict accessible small molecules for essentially every protein in the structurally determined proteome, using molecular docking screens. The second goal is to relate the docking hit lists that emerge to one another by ligand similarity, asking which proteins might be functionally related. Selected predictions will be tested experimentally.
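
To make the second goal concrete, the idea of relating two targets through their docking hit lists can be sketched in a few lines of Python. This is an illustration only, not the statistical model of the cited ligand-similarity work: it assumes each docking hit is already encoded as a set of "on" fingerprint bits, and the function names, the 0.5 Tanimoto cutoff, and the toy targets are all hypothetical choices made for the example.

```python
from itertools import product


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def hit_list_similarity(hits_a, hits_b, threshold=0.5):
    """Crude similarity between two docking hit lists.

    Sums the Tanimoto coefficients of all cross-list ligand pairs that reach
    `threshold` (an assumed cutoff) and divides by the number of pairs, so
    that list size alone does not inflate the score.
    """
    if not hits_a or not hits_b:
        return 0.0
    total = 0.0
    for x, y in product(hits_a, hits_b):
        t = tanimoto(x, y)
        if t >= threshold:
            total += t
    return total / (len(hits_a) * len(hits_b))


# Toy hit lists for three hypothetical targets (fingerprints as bit-index sets).
target_1 = [{1, 2, 3, 4}, {2, 3, 4, 5}]
target_2 = [{1, 2, 3, 4}, {10, 11}]
target_3 = [{20, 21}, {22, 23}]

score_12 = hit_list_similarity(target_1, target_2)
score_13 = hit_list_similarity(target_1, target_3)
print(score_12, score_13)
```

In this toy case the first two targets share similar predicted ligands and receive a nonzero score, while the first and third share none. A real analysis would replace the raw score with a measure of statistical significance against a random background, since a raw sum of similarities depends on the chemical library and on the lengths of the hit lists.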