The goal of this research is to develop new software tools for recording and classifying free-text descriptions of adverse reactions to new medical therapies. A major challenge is finding ways to accommodate the conflicting requirements of universality of coverage and uniformity of coding. A three-step model is proposed: extraction of linguistic terms, classification into an intermediate language vocabulary, and selection of codes from a codelist. The extraction step applies document retrieval techniques to assist the medical coder in the selection of relevant terms. Term subsets are classified into an intermediate language based upon the Unified Medical Language System (UMLS) Metathesaurus of the National Library of Medicine. Customizable coding modules can then be used to map intermediate language terms into output codes. In collaboration with experts in drug development and thesaurus construction, Phase I will evaluate the utility of this model, with particular focus upon the divergent goals of universality and uniformity. Phase II will produce a full software prototype and will explore integration with new document-retrieval-based approaches for increased automation. The major anticipated health-related contribution of this research is an enhanced ability to capture and analyze the all-important free-text information related to the safety profile of new therapies.
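The three-step model described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the proposed software: the vocabulary, the concept identifiers (for example `CONCEPT_NAUSEA`), the codelist names, and the output codes are all hypothetical placeholders and are not taken from the UMLS Metathesaurus or from any real codelist. The extraction step is reduced to simple token matching as a stand-in for document retrieval techniques.

```python
import re

# Hypothetical intermediate vocabulary: surface phrase -> concept identifier.
# These identifiers are placeholders, NOT real UMLS Metathesaurus entries.
INTERMEDIATE_VOCAB = {
    "nausea": "CONCEPT_NAUSEA",
    "queasy": "CONCEPT_NAUSEA",
    "headache": "CONCEPT_HEADACHE",
    "head pain": "CONCEPT_HEADACHE",
    "skin rash": "CONCEPT_RASH",
}

# Hypothetical customizable coding modules: one mapping per output codelist.
CODING_MODULES = {
    "codelist_a": {
        "CONCEPT_NAUSEA": "A-101",
        "CONCEPT_HEADACHE": "A-205",
        "CONCEPT_RASH": "A-330",
    },
    "codelist_b": {
        "CONCEPT_NAUSEA": "GI01",
        "CONCEPT_HEADACHE": "NS07",
    },
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def extract_terms(report, vocab=INTERMEDIATE_VOCAB):
    """Step 1: propose vocabulary phrases whose tokens all occur in the report.

    Token order and adjacency are ignored, so these are candidates for a
    human coder to confirm, not final selections.
    """
    report_tokens = set(tokenize(report))
    return sorted(
        phrase for phrase in vocab
        if all(token in report_tokens for token in tokenize(phrase))
    )


def classify(terms, vocab=INTERMEDIATE_VOCAB):
    """Step 2: map confirmed terms to intermediate-language concepts."""
    return sorted({vocab[term] for term in terms})


def select_codes(concepts, module_name):
    """Step 3: map concepts to output codes using one coding module.

    Concepts the codelist cannot express are returned separately instead of
    being dropped, so the intermediate record stays complete.
    """
    module = CODING_MODULES[module_name]
    codes = {c: module[c] for c in concepts if c in module}
    unmapped = [c for c in concepts if c not in module]
    return codes, unmapped


if __name__ == "__main__":
    report = "Patient felt queasy and reported head pain plus a skin rash."
    terms = extract_terms(report)
    concepts = classify(terms)
    for name in CODING_MODULES:
        print(name, select_codes(concepts, name))
```

Keeping the intermediate concepts separate from the output codes is one way to reconcile the two goals named above: the intermediate record can cover everything the report says, while each coding module enforces the uniformity of its own codelist and reports what it could not map.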
Improved techniques for capturing adverse reaction information in a form suitable for retrieval and analysis will have widespread appeal to organizations engaged in clinical testing of new therapies, including pharmaceutical companies and academic and government research centers. The founders of Belmont Research Inc. have a strong track record in the development, commercialization, and distribution of software to support biomedical applications.