Collaborative Drug Discovery, Inc. (CDD) proposes to extend its innovative software platform, BioAssay Express (BAE, version 1.0), which helps biologists quickly and easily encode their plain-text biological assay protocols into formats suitable for computational processing. In Phase 2B, CDD will further enhance BAE so that it can mark up a bioassay with a comprehensive, detailed set of annotations that completely specifies the protocol, step by step. In its current state of development in Phase 2, BAE 1.0 enables scientists to summarize the English-language text that specifies an experimental protocol, supplementing it with machine-interpretable descriptors. At the end of Phase 2B, BAE 2.0 will output a detailed machine-interpretable version that will be completely equivalent to, and can optionally substitute for, the traditional English-language method description. The enhanced BAE 2.0 software will enable scientists engaged in early-stage drug discovery to efficiently and accurately document experimental procedures and intelligently aggregate datasets across research groups. An important benefit will be improved access, use and reproducibility, achieved by (1) flagging differences between closely related assays; (2) correlating differences in protocols with differences in results; and (3) improving the reproducibility of experiments conducted in different laboratories by highlighting potential causes of divergence. The project will continue to leverage the related broad initiatives at NIH and elsewhere that are working to promote translational research, gather screening data via open repositories, and apply sophisticated ontologies to classify these datasets. 
Our project targets the intersection of these disparate initiatives and unifies them by making their complex standards and guidelines for praxis realistically accessible to researchers who want to focus on their scientific work. To encourage adoption, the software will prioritize intuitive ease of use by scientists who are not informatics experts, harmonize with existing laboratory workflows, minimize the extra effort of annotation, integrate seamlessly into preclinical data management platforms (including but not limited to CDD's own CDD Vault), and deliver clear and immediate benefits to the user as part of an integrated experience. This combination of new capabilities and extreme ease of use will accelerate translational drug discovery efforts by empowering software platforms that bridge the divide between biologists and medicinal chemists to apply sophisticated tools (long available on the chemistry side) for the first time also to the biological side, and thus across both domains. Existing software can already easily connect screening results to chemical structures. This new platform will further connect these data to the purpose and methodology of the screens.

Public Health Relevance

The proposed project will create novel computational tools that will help researchers to efficiently and accurately document experimental procedures, thereby facilitating improved access, use and reproducibility of data and assays. This fundamental capability will ultimately accelerate the translation of new experimental discoveries into novel and improved drugs against a wide range of diseases. These tools will particularly benefit networks of researchers working on diseases that leading pharmaceutical companies have largely ignored because they are not perceived as highly profitable opportunities, even though in many cases they afflict millions of people.

National Institutes of Health (NIH)
National Center for Advancing Translational Sciences (NCATS)
Small Business Innovation Research Grants (SBIR) - Phase II (R44)
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Colvis, Christine
Collaborative Drug Discovery, Inc.
United States