The advent of target-based drug discovery, assisted by advances in X-ray crystallography and NMR, has left pharmaceutical companies suffering from both too much and too little information: too much in the sense that the early push for target discovery has left pharmaceutical R&D drowning in potentially relevant targets and drug candidates; too little in the sense that even cutting-edge in silico methods of compound screening fail to produce the target-to-compound relational and trending data that could assist in selecting promising future candidates. At present, taking a candidate compound from a hit to a lead takes years and costs tens of millions of dollars. Early pharmaceutical R&D needs a way to streamline this phase of development: a mathematically and biologically sound method for optimizing and ranking candidate compounds and their relationships to promising biological targets. The VaSSA technology, developed by Dr. Jeffrey Clark of Bioinformatica, LLC, and completed with the assistance of Dr. Gerald Wyckoff of the University of Missouri, has the potential to solve this problem. VaSSA implements an entirely novel approach to biological data. It measures information content, an independent variable that permits rigorous statistical analysis of nucleotide and amino acid data. VaSSA has proven successful at optimizing and ranking biologically relevant targets through information content analysis alone. At present, it can measure information content in nucleotide and amino acid data; however, we believe that an existing syntax for examining peptide data can be modified to measure the information content of organic chemical compounds. The goal of this study, therefore, is to develop a cheminformatic module for the VaSSA software that will play a critical role in the drug discovery cycle.
Specifically, cheminformatic analysis, particularly as applied to target-to-compound relationships and the trends in those data, could vastly improve candidate identification and shorten the hits-to-leads cycle. The project's specific aims are to: (1) develop a cheminformatic module for VaSSA; (2) validate the cheminformatic syntax using the existing VaSSA peptide syntax; (3) rank a set of 250 organic molecules based on information content analysis and validate the results; and (4) develop an industry partnership to assist in further development of the technology. Dr. Wyckoff, in partnership with a programming consultant, will complete the module build and validation, assist Bioinformatica, LLC in securing an appropriate industry partner, and advise on future applications for the module, potentially including its application to inorganic molecules and its role in the lead optimization phase of the drug discovery cycle.

Public Health Relevance

We propose the creation of a cheminformatic module as an extension to the VaSSA software that is already in production. This software is meant to extend the range of services that Bioinformatica can provide by allowing for the integration of information theory into the drug development cycle.

National Institutes of Health (NIH)
National Institute of General Medical Sciences (NIGMS)
Small Business Technology Transfer (STTR) Grants - Phase I (R41)
Study Section: Special Emphasis Panel (ZRG1-IMST-H (14))
Program Officer: Lyster, Peter
Bioinformatica, LLC
United States
Yang, Ming; Wyckoff, Gerald J (2011) Detection of selection utilizing molecular phylogenetics: a possible approach. Genetica 139:639-48