Collaborative Drug Discovery, Inc. (CDD) proposes to create an innovative software module that will help biologists quickly and easily encode their plain-text biological assay protocols into formats suitable for computational processing. The software will enable scientists engaged in early-stage drug discovery to automatically identify, sort and compare datasets across research groups, and to document experimental procedures efficiently and properly. To encourage adoption, the software will integrate seamlessly into preclinical data management platforms (such as CDD's), prioritize intuitive ease of use for scientists who are not informatics experts, harmonize with existing laboratory workflows, minimize the extra effort of annotation, and deliver clear and immediate benefits to the user as part of an integrated experience. This combination of new capabilities and extreme ease of use will accelerate translational drug discovery by enabling software platforms that bridge the divide between biologists and medicinal chemists to apply sophisticated tools, long available on the chemistry side, to the biological side for the first time, and thus across both domains. Existing software can already easily connect screening results to chemical structures. The new platform will further connect these data to the purpose and methodology of the screens.
Specific aims for Phase 2 are to:
1. Complete development of the novel annotation platform that interactively encodes assay protocols using an expressive ontology. The software components will be designed to be modular, flexible, robust and versatile so that they can also be incorporated into other types of platforms, such as websites, ELNs, and LIMS.
2. Train the software on a broad corpus of assay protocols (>20,000) so it is ready for widespread use.
3. Develop the capability for the platform to train itself through continued use to (a) improve its performance for end users and (b) help bioinformatics specialists maintain and extend the underlying ontologies.
4. Document a quantitative improvement in annotation accuracy compared with fully automated approaches.
5. Deploy the core technology as a free, web-based service to scientists (e.g. in collaboration with PubChem).
6. Develop applications that utilize the core technology to deliver immediate benefits to end users (as described in more detail in Section I.B of the Research Strategy) and thereby promote adoption.
The proposed project will create novel computational tools that will help researchers to translate new experimental discoveries into the development of novel and improved drugs against a wide range of diseases. These tools will particularly benefit networks of researchers working on diseases that leading pharmaceutical companies have largely ignored because they are not perceived as highly profitable opportunities, despite the fact that in many cases they afflict millions of people.
Clark, Alex M; Bunin, Barry A; Litterman, Nadia K et al. (2014) Fast and accurate semantic annotation of bioassays exploiting a hybrid of machine learning and user confirmation. PeerJ 2:e524