The development of highly efficient and accurate approaches to structure-based virtual screening (VS) remains a formidable challenge in computational drug discovery. Widely recognized open problems include the relative computational inefficiency of most approaches, which limits the size of the molecular libraries that can be screened; the low hit rate; and the inaccurate prediction of ligand binding affinity and pose. The proposed studies address these challenges with an integrated two-step VS methodology that combines concepts from the complementary fields of cheminformatics and molecular simulation. Building upon our experience in cheminformatics and QSAR modeling, we aim to develop novel, computationally efficient cheminformatics approaches to pre-process very large chemical libraries available for biological screening (on the order of 10^7 compounds) and eliminate up to 99% of improbable ligands. Only the remaining 1% of probable ligands will be evaluated by slower but more accurate ensemble flexible docking approaches relying on molecular simulation techniques. The cheminformatics step will also produce important information on privileged protein-ligand interactions that will be used in a live-processing step to guide the structure-based virtual screening and avoid oversampling of ligand poses. Moreover, post-processing cheminformatics methods will be implemented to filter out decoy poses from docking calculations. The ultimate goal of our hybrid methodology is to arrive at a small set of high-affinity computational hits in receptor-bound conformations that can be validated experimentally.
We will pursue this goal through three specific aims: 1) Develop novel cheminformatics-based virtual screening approaches to eliminate both improbable ligands and improbable poses, as well as generate information on preferred protein-ligand interactions; 2) Develop new, efficient flexible ensemble docking methods guided by the preferred protein-ligand interactions to select the most probable ligands and predict their binding poses; 3) Apply the developed hierarchical virtual screening workflow to several therapeutic targets and test high-confidence computational hits in experimental assays. All computational tools resulting from this project will be made publicly available. This proposal is innovative because the proposed VS platform will result from a unique marriage of disparate approaches for VS, combining their respective strengths. This proposal is significant because the implementation of this project will enable substantial improvement in the efficiency, accuracy, and experimentally confirmed impact of structure-based drug discovery tools.
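The hierarchical funnel described in the abstract can be illustrated with a minimal sketch. With a retention rate of 1%, a library on the order of 10^7 compounds would leave roughly 10^5 compounds for the docking stage. Everything in the code below is a hypothetical stand-in rather than the project's actual tooling: the function `two_stage_screen`, the parameters `keep_fraction` and `n_hits`, the integer compound IDs, and both toy scoring functions are assumptions made for illustration only.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def two_stage_screen(
    library: Iterable[int],
    cheap_score: Callable[[int], float],
    expensive_score: Callable[[int], float],
    keep_fraction: float = 0.01,
    n_hits: int = 10,
) -> Tuple[List[int], List[Tuple[float, int]]]:
    """Keep the top `keep_fraction` by a cheap score, then rank survivors by a costly one."""
    compounds = list(library)
    n_keep = max(1, round(len(compounds) * keep_fraction))
    # Stage 1 (stand-in for the cheminformatics filter): cheap score over the whole library.
    survivors = heapq.nlargest(n_keep, compounds, key=cheap_score)
    # Stage 2 (stand-in for ensemble docking): the costly score runs on survivors only.
    ranked = sorted(((expensive_score(c), c) for c in survivors), reverse=True)
    return survivors, ranked[:n_hits]


# Hypothetical toy data: integer IDs with arbitrary deterministic scores.
def toy_cheap_score(compound_id: int) -> float:
    return float((compound_id * 7919) % 10007)


def toy_expensive_score(compound_id: int) -> float:
    return float((compound_id * 104729) % 1009)


survivors, hits = two_stage_screen(range(10_000), toy_cheap_score, toy_expensive_score)
print(len(survivors), len(hits))  # prints "100 10"
```

In this sketch the expensive score is evaluated only 100 times for a 10,000-compound toy library, which is the source of the efficiency gain the two-step design aims for. A real workflow would replace the toy scores with an actual cheminformatics model and a docking engine, and would likely stream the library rather than hold it in memory.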

Public Health Relevance

Advances in drug discovery rely on the development of novel, effective computational methodologies. This proposal advances an efficient and robust computational workflow for structure-based virtual screening of very large chemical libraries. The ultimate goal of this project is to arrive at a small number of candidate molecules with high predicted binding affinity to their biological targets, which will then be tested in confirmatory experiments.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Research Project (R01)
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Hagan, Ann A
Pennsylvania State University
Schools of Medicine
United States
Capuzzi, Stephen J; Sun, Wei; Muratov, Eugene N et al. (2018) Computer-Aided Discovery and Characterization of Novel Ebola Virus Inhibitors. J Med Chem 61:3582-3594
Shobair, Mahmoud; Popov, Konstantin I; Dang, Yan L et al. (2018) Mapping allosteric linkage to channel gating by extracellular domains in the human epithelial sodium channel. J Biol Chem 293:3675-3684
Dagliyan, Onur; Krokhotin, Andrey; Ozkan-Dagliyan, Irem et al. (2018) Computational design of chemogenetic and optogenetic split proteins. Nat Commun 9:4042
Zhu, Cheng; Beck, Matthew V; Griffith, Jack D et al. (2018) Large SOD1 aggregates, unlike trimeric SOD1, do not impact cell viability in a model of amyotrophic lateral sclerosis. Proc Natl Acad Sci U S A 115:4661-4665
Han, Qingjian; Liu, Di; Convertino, Marino et al. (2018) miRNA-711 Binds and Activates TRPA1 Extracellularly to Evoke Acute and Chronic Pruritus. Neuron 99:449-463.e6
Williams 2nd, Benfeard; Zhao, Bo; Tandon, Arpit et al. (2017) Structure modeling of RNA using sparse NMR constraints. Nucleic Acids Res 45:12638-12647
Dronamraju, Raghuvar; Ramachandran, Srinivas; Jha, Deepak K et al. (2017) Redundant Functions for Nap1 and Chz1 in H2A.Z Deposition. Sci Rep 7:10791
Brodie, Nicholas I; Popov, Konstantin I; Petrotchenko, Evgeniy V et al. (2017) Solving protein structures using short-distance cross-linking constraints as a guide for discrete molecular dynamics simulations. Sci Adv 3:e1700479