Collaborative Drug Discovery, Inc. (CDD) proposes to develop a suite of software modules to enable scientists to unambiguously represent chemical mixtures in standard machine-readable formats, filling an urgent and widely recognized need. Chemicals are typically formulated as mixtures. Recording and communicating information about chemical mixtures is essential for scientists and support staff in the pharmaceutical industry, in academia, in non-profit research organizations, in government, at specialty chemical vendors, and at commercial manufacturers to:
• discover, develop, formulate, manufacture and regulate drugs;
• manage reagent inventories; comply with laboratory safety requirements; inform first responders;
• describe and reproduce biomedical experiments; and
• assess and disseminate information about toxicity risks of chemical reagents and consumer products.
A working committee of the International Union of Pure and Applied Chemistry (IUPAC) is close to formalizing "Mixtures InChI" (or MInChI), which will extend the International Chemical Identifier (InChI) to become the first standard to encompass mixtures. MInChI will effectively index mixtures in the same way that InChI indexes individual compounds.
In Phase 1, CDD developed the data structures and software necessary to enable adoption and utilization of MInChI and create the first general-purpose system for recording information about chemical mixtures that is computable and interoperable. In Phase 2, CDD will continue to develop a sophisticated automated translation tool that will accurately convert legacy catalogs of chemical mixtures from plaintext descriptions or ad hoc formats so that they are properly represented in a machine-readable format that can in turn be easily rendered into MInChI identifiers.
The broad vision is to help industry overcome the barriers to adoption so that machine-readable mixture descriptions can quickly deliver benefits for drug discovery, chemical safety, and toxicology.
The proposed project will create novel computational tools that will help researchers to efficiently and accurately document the composition of chemical mixtures in a format that computers can easily interpret, process, and exchange. This innovative capability will help to accelerate the discovery and development of novel and improved drugs against a wide range of diseases. It will also help to advance our understanding of the toxicology of mixtures (which often differs from the toxicology of individual components) and improve laboratory safety both in industry and in educational settings.