Biocomputation across distributed private datasets to enhance drug discovery

Ekins, Sean

Abstract

Collaborative Drug Discovery, Inc. (CDD) will create a novel web-based software platform that enables scientists to work together effectively to discover and improve new drug leads by sharing computational predictions based on open-source descriptors and models, for the first time without needing to reveal the underlying chemical structures and biodata. It will be the first practical system for biocomputational analysis across distributed datasets with different owners that respects data privacy. By lowering this key barrier to collaboration, the platform will accelerate the pre-clinical drug discovery pipeline.
Research aimed at neglected diseases and orphan indications will especially benefit, because it often relies on the loosely affiliated efforts of academic investigators, non-profit foundations, government laboratories, and small biotechnology firms (extra-pharma entities). Such efforts typically lack not only the resources but also the integrated workflows of discovery projects conducted at large pharmaceutical companies (within which data can be shared freely across departments). The project will for the first time enable researchers focused on neglected diseases and orphan indications to effectively exploit biocomputational tools such as virtual screening and ADME/Tox predictions, which are now considered standard and indispensable components of early discovery workflows within large pharma. It will also make it easier for these extra-pharma researchers to collaborate with large pharma and benefit from large pharma's significant investment in accumulating large, high-quality datasets.
In Phase II of this SBIR project, CDD will:
1. Create a stand-alone platform, based entirely on open-source technologies, that enables researchers to share models, share predictions from models, and create models from distributed, heterogeneous QSAR data, all without needing to divulge the underlying training sets.
2. Develop approaches that enable scientists who are not computational chemists to exploit the technology. A series of user interfaces will automate and intelligently guide the user to create or exploit models and assist the user to visualize domains of applicability, interpret results, and understand their limitations. The integrated platforms will enable scientists to seamlessly create, share, and execute computational models leveraging private data vaults, with or without sharing the underlying training data.
3. Validate the platform by (a) developing a suite of at least five ADME/Tox and physicochemical property models based on open-source descriptors and data obtained from commercial ADME vendors, as well as public data from PubChem, ChEMBL, and other sources, (b) securely making available a series of sophisticated pre-competitive ADME/Tox models provided by large pharmaceutical companies, and (c) demonstrating that collaborators can use the platform on their own (without relying on a computational chemist) to discover and advance TB drug leads with good ADME/Tox properties.
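
The model-sharing idea in aim 1 can be illustrated with a minimal sketch, assuming a Laplacian-corrected naive Bayesian classifier over hashed structural features, in the spirit of the open-source Bayesian models listed under Publications. This is not CDD's implementation: the integer feature identifiers, the function names, and the toy data are all hypothetical. The point it shows is that only per-feature weights are exchanged, while the training structures and assay values stay with the data owner.

```python
import json
import math
from collections import Counter


def train_bayesian(training_set):
    """Fit a Laplacian-corrected naive Bayesian model.

    training_set: list of (feature_ids, is_active) pairs, where feature_ids
    is a set of ints (hypothetical stand-ins for hashed fingerprint features).
    Returns a dict mapping feature id -> log-scale contribution.
    """
    total = len(training_set)
    actives = sum(1 for _, active in training_set if active)
    base_rate = actives / total
    feature_total = Counter()
    feature_active = Counter()
    for features, active in training_set:
        for f in features:
            feature_total[f] += 1
            if active:
                feature_active[f] += 1
    # Laplacian correction pulls rarely seen features toward zero weight.
    return {
        f: math.log((feature_active[f] + 1) / (feature_total[f] * base_rate + 1))
        for f in feature_total
    }


def predict(model, features):
    """Sum the contributions of known features; unseen features add nothing."""
    return sum(model.get(f, 0.0) for f in features)


def coverage(model, features):
    """Fraction of a query's features that the model has seen (crude applicability check)."""
    if not features:
        return 0.0
    return sum(1 for f in features if f in model) / len(features)


# Private training data: assumed never to leave the owner's vault (toy values).
private_data = [
    ({1, 2, 3}, True),
    ({1, 2, 4}, True),
    ({5, 6}, False),
    ({3, 5, 7}, False),
]
model = train_bayesian(private_data)

# Only the weight table is serialized and shared, not the training set.
shared = json.dumps(model)  # JSON turns the int keys into strings
received = {int(k): w for k, w in json.loads(shared).items()}

# A collaborator scores their own compound against the received model.
query = {1, 2, 99}
print(predict(received, query), coverage(received, query))
```

The `coverage` helper is one simple, assumed proxy for the "domain of applicability" mentioned in aim 2: a query whose features are mostly unknown to the model should be treated as outside the region where its predictions are meaningful.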

Public Health Relevance

The proposed project will create novel computational tools that will help researchers accelerate the discovery of new and improved drugs against a wide range of diseases. These tools will particularly benefit researchers working on diseases that leading pharmaceutical companies have largely ignored because they are not perceived as highly profitable opportunities, despite the fact that in many cases they afflict millions of people.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Center for Advancing Translational Sciences (NCATS)
Type: Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Project #: 5R44TR000942-04
Application #: 8910529
Study Section: Special Emphasis Panel (ZRG1-IMST-G (10))
Program Officer: Colvis, Christine

Project Start: 2013-08-16
Project End: 2016-07-31
Budget Start: 2015-08-01
Budget End: 2016-07-31
Support Year: 4
Fiscal Year: 2015
Total Cost: $379,158

Institution

Name: Collaborative Drug Discovery, Inc.
DUNS #: 149823846

City: Burlingame
State: CA
Country: United States
Zip Code: 94010

Related projects


NIH 2018 R44 TR	Biocomputation across distributed private datasets to enhance drug discovery Bunin, Barry A. / Collaborative Drug Discovery, Inc.
NIH 2017 R44 TR	Biocomputation across distributed private datasets to enhance drug discovery Bunin, Barry A. / Collaborative Drug Discovery, Inc.	$750,457
NIH 2015 R44 TR	Biocomputation across distributed private datasets to enhance drug discovery Ekins, Sean / Collaborative Drug Discovery, Inc.	$379,158
NIH 2014 R44 TR	Biocomputation across distributed private datasets to enhance drug discovery Ekins, Sean / Collaborative Drug Discovery, Inc.	$562,173
NIH 2013 R44 TR	Biocomputation across distributed private datasets to enhance drug discovery Ekins, Sean / Collaborative Drug Discovery, Inc.	$462,944

Publications

Lane, Thomas; Russo, Daniel P; Zorn, Kimberley M et al. (2018) Comparing and Validating Machine Learning Models for Mycobacterium tuberculosis Drug Discovery. Mol Pharm 15:4346-4360

Ekins, Sean; Clark, Alex M; Dole, Krishna et al. (2018) Data Mining and Computational Modeling of High-Throughput Screening Datasets. Methods Mol Biol 1755:197-221

Stratton, Thomas P; Perryman, Alexander L; Vilchèze, Catherine et al. (2017) Addressing the Metabolic Stability of Antituberculars through Machine Learning. ACS Med Chem Lett 8:1099-1104

Mikušová, Katarína; Ekins, Sean (2017) Learning from the past for TB drug discovery in the future. Drug Discov Today 22:534-545

Ekins, Sean; Spektor, Anna Coulon; Clark, Alex M et al. (2017) Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB). Drug Discov Today 22:555-565

Perryman, Alexander L; Stratton, Thomas P; Ekins, Sean et al. (2016) Predicting Mouse Liver Microsomal Stability with "Pruned" Machine Learning Models and Public Data. Pharm Res 33:433-49

Clark, Alex M; Dole, Krishna; Ekins, Sean (2016) Open Source Bayesian Models. 3. Composite Models for Prediction of Binned Responses. J Chem Inf Model 56:275-85

Ekins, Sean; Perryman, Alexander L; Clark, Alex M et al. (2016) Machine Learning Model Analysis and Data Visualization with Small Molecules Tested in a Mouse Model of Mycobacterium tuberculosis Infection (2014-2015). J Chem Inf Model 56:1332-43

Ekins, Sean (2016) The Next Era: Deep Learning in Pharmaceutical Research. Pharm Res 33:2594-603

Clark, Alex M; Ekins, Sean (2015) Open Source Bayesian Models. 2. Mining a "Big Dataset" To Create and Validate Models with ChEMBL. J Chem Inf Model 55:1246-60

Showing the most recent 10 out of 17 publications
