The long-term aim of this project is to advance clinical care and biomedical research by establishing a natural language processing (NLP) resource for the biomedical community. A major bottleneck in the development of automated tools for clinical applications and biomedical research is that most data and knowledge occur in the form of text, resulting in a lack of coded data. This NLP resource will make possible a host of automated applications by enabling high-throughput access to coded biomedical knowledge and data. The foundation of this resource will be the MedLEE NLP system, which has been used operationally for almost a decade in healthcare settings for a broad range of applications that have proven valuable for clinical care. The NLP resource will also include BioMedLEE (a derivative of MedLEE), which encodes genotypic-phenotypic (GP) relations in the scientific literature. BioMedLEE currently focuses on GP relations associated with cancer and infectious diseases, and is being used to organize the extracted information to facilitate research, curation, and ontological development within model organism databases. This proposal will enable us to 1) disseminate our NLP resource to the community, 2) conduct technological research and development (R&D) to facilitate expansion and adaptation of the resource to new applications and specialties, 3) conduct R&D of tools that facilitate use of the extracted data and knowledge after coding, and 4) promote the resource and provide service to users in the form of technical support, documentation, and tutorials. MedLEE and BioMedLEE are extensible systems that together serve the clinical and scientific communities.
The dissemination of a proven NLP system that is applicable to the entire biomedical community provides an exceptional opportunity for multiple developers and researchers to work to unleash the true potential of NLP technology, increasing development of applications that aim to enhance scientific research and improve all levels of health.