Chemical space is big data: the number of drug-like molecules is estimated to exceed 10^60. Experimentally screening compound libraries for drug candidates is a time-consuming and expensive process. Virtual screening is a cheaper, faster approach for identifying potential drug candidates. Existing virtual screening methods typically scale linearly with the size of the compound library. A virtual screen of a million compounds may take days and requires a significant investment in computational infrastructure. The lack of scalable virtual screening algorithms and the difficulty of accessing the infrastructure necessary to perform large-scale virtual screening severely limit the ability of researchers to explore the big data of chemical space.
This research plan will develop scalable virtual screening algorithms that will enable virtual screening on an interactive time scale (seconds to minutes). Interactive algorithms support the integration of expert human insight and knowledge with computational methods and permit rapid hypothesis testing and exploration. These interactive algorithms will be deployed both as open-source software and as part of an online drug discovery collaboration environment. The online environment will provide immediate access to the big data infrastructure needed to enable rapid and collaborative online virtual screening.
Algorithms for filtering compound libraries based on pharmacophore and molecular shape properties will be developed. Unlike current approaches, these algorithms will scale with the breadth and complexity of the query, not with the size of the compound database, enabling scalable and rapid filtering of billions of chemical structures. Efficient methods for ranking the filtered results that harness the computational power of modern graphics processing units will also be developed. Backed by the appropriate computational resources, these algorithms will support the screening of billions of chemical structures on an interactive time scale.
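To make the scaling claim concrete, the following is a minimal illustrative sketch, not the algorithms proposed in this plan. It assumes a simplified setting in which each compound is described by a few labeled 3D pharmacophore points and the library is pre-indexed by binned pairwise distances. All names (`feature_keys`, `build_index`, `screen`), the bin width, and the toy data are hypothetical. The point it shows is that the query touches only the index entries for its own feature pairs, so the work grows with the complexity of the query rather than with the number of compounds in the library.

```python
from collections import defaultdict
from itertools import combinations
import math


def feature_keys(features, bin_width=1.0):
    """Map (kind, x, y, z) points to a set of (kind_a, kind_b, distance_bin) keys.

    Hypothetical simplification: exact bin matching stands in for a real
    distance tolerance, so points near a bin boundary can be missed.
    """
    keys = set()
    for (ka, *pa), (kb, *pb) in combinations(features, 2):
        a, b = sorted((ka, kb))
        keys.add((a, b, int(math.dist(pa, pb) // bin_width)))
    return keys


def build_index(library, bin_width=1.0):
    """One-time preprocessing: inverted index from key to compound ids."""
    index = defaultdict(set)
    for compound_id, features in library.items():
        for key in feature_keys(features, bin_width):
            index[key].add(compound_id)
    return index


def screen(index, query_features, bin_width=1.0):
    """Return ids of compounds that contain every key of the query.

    Only the posting sets named by the query are read; compounds that
    share no key with the query are never visited.
    """
    keys = feature_keys(query_features, bin_width)
    if not keys:
        return set()
    postings = [index.get(key, set()) for key in keys]
    postings.sort(key=len)  # intersect the smallest sets first
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
        if not result:
            break
    return result


# Toy library (made-up coordinates, in angstroms).
library = {
    "A": [("donor", 0, 0, 0), ("acceptor", 3.2, 0, 0), ("aromatic", 0, 4.5, 0)],
    "B": [("donor", 0, 0, 0), ("acceptor", 6.1, 0, 0)],
    "C": [("donor", 1, 1, 1), ("acceptor", 4.3, 1, 1), ("aromatic", 1, 5.6, 1)],
}
index = build_index(library)
query = [("donor", 0, 0, 0), ("acceptor", 3.5, 0, 0)]
hits = screen(index, query)
```

With this toy data, `hits` contains compounds A and C, whose donor-acceptor distance falls in the same bin as the query; B is never examined because none of its keys are requested. A production method would need tolerance-aware matching and a final geometric alignment step, both omitted here.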
The interactive performance of the tools will support rapid hypothesis testing and experimentation, and users will be able to submit their own compound libraries for screening, encouraging cross-discipline collaboration.
The proposed research will result in novel algorithms and systems for the storage, retrieval, and analysis of chemical data to support the rapid identification of compounds of therapeutic interest. Successful application of these algorithms will reduce the cost and time of development of new drugs.
Koes, David R; Dömling, Alexander; Camacho, Carlos J (2018) AnchorQuery: Rapid online virtual screening for small-molecule protein-protein interaction inhibitors. Protein Sci 27:229-232
Gau, David; Lewis, Taber; McDermott, Lee et al. (2018) Structure-based virtual screening identifies a small-molecule inhibitor of the profilin 1-actin interaction. J Biol Chem 293:2606-2616
Hochuli, Joshua; Helbling, Alec; Skaist, Tamar et al. (2018) Visualizing convolutional neural network protein-ligand scoring. J Mol Graph Model 84:96-108
Sunseri, Jocelyn; King, Jonathan E; Francoeur, Paul G et al. (2018) Convolutional neural network scoring and minimization in the D3R 2017 community challenge. J Comput Aided Mol Des
Koes, David R; Vries, John K (2017) Evaluating amber force fields using computed NMR chemical shifts. Proteins 85:1944-1956
Ragoza, Matthew; Hochuli, Joshua; Idrobo, Elisa et al. (2017) Protein-Ligand Scoring with Convolutional Neural Networks. J Chem Inf Model 57:942-957
Koes, David R; Vries, John K (2017) Error assessment in molecular dynamics trajectories using computed NMR chemical shifts. Comput Theor Chem 1099:152-166
Pirhadi, Somayeh; Sunseri, Jocelyn; Koes, David Ryan (2016) Open source molecular modeling. J Mol Graph Model 69:127-143
Sunseri, Jocelyn; Ragoza, Matthew; Collins, Jasmine et al. (2016) A D3R prospective evaluation of machine learning for protein-ligand scoring. J Comput Aided Mol Des 30:761-771
Hain, Ethan; Camacho, Carlos J; Koes, David Ryan (2016) Fragment oriented molecular shapes. J Mol Graph Model 66:143-154
Showing the most recent 10 out of 14 publications