The long-term aim of this project is to advance clinical care and biomedical research by establishing a natural language processing (NLP) resource for the biomedical community. A major bottleneck in developing automated tools for clinical applications and biomedical research is that most data and knowledge exist only as text, so coded data are scarce. This NLP resource will make a host of automated applications possible by enabling high-throughput access to coded biomedical knowledge and data.

The foundation of this resource will be the MedLEE NLP system, which has been used operationally for almost a decade in healthcare settings for a broad range of applications that have proven valuable for clinical care. The resource will also include BioMedLEE, a derivative of MedLEE that encodes genotypic-phenotypic (GP) relations in the scientific literature. BioMedLEE currently focuses on GP relations associated with cancer and infectious diseases, and is being used to organize the extracted information to facilitate research, curation, and ontological development within model organism databases.

This proposal will enable us to (1) disseminate our NLP resource to the community; (2) conduct technological research and development (R&D) to facilitate expansion and adaptation of the resource to new applications and specialties; (3) conduct R&D of tools that facilitate use of the extracted data and knowledge after coding; and (4) promote the resource and provide service to users in the form of technical support, documentation, and tutorials.

MedLEE and BioMedLEE are extensible systems that serve both the clinical and scientific communities. Disseminating a proven NLP system that is applicable to the entire biomedical community gives many developers and researchers an exceptional opportunity to realize the potential of NLP technology, increasing the development of applications that aim to enhance scientific research and improve health at all levels.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM008635-03
Application #: 7257857
Study Section: Special Emphasis Panel (ZLM1-HS-R (J2))
Program Officer: Ye, Jane
Project Start: 2005-07-01
Project End: 2009-06-30
Budget Start: 2007-07-01
Budget End: 2008-06-30
Support Year: 3
Fiscal Year: 2007
Total Cost: $529,014
Indirect Cost:

Name: Columbia University (N.Y.)
Department: Internal Medicine/Medicine
Type: Schools of Medicine
DUNS #: 621889815
City: New York
State: NY
Country: United States
Zip Code: 10032

Boyd, Andrew D; Dunn Lopez, Karen; Lugaresi, Camillo et al. (2018) Physician nurse care: A new use of UMLS to measure professional contribution: Are we talking about the same patient a new graph matching algorithm? Int J Med Inform 113:63-71
Yadav, Kabir; Sarioglu, Efsun; Choi, Hyeong Ah et al. (2016) Automated Outcome Classification of Computed Tomography Imaging Reports for Pediatric Traumatic Brain Injury. Acad Emerg Med 23:171-8
Salmasian, Hojjat; Tran, Tran H; Chase, Herbert S et al. (2015) Medication-indication knowledge bases: a systematic review and critical appraisal. J Am Med Inform Assoc 22:1261-70
Li, Ying; Salmasian, Hojjat; Vilar, Santiago et al. (2014) A method for controlling complex confounding effects in the detection of adverse drug reactions using electronic health records. J Am Med Inform Assoc 21:308-14
Vilar, Santiago; Uriarte, Eugenio; Santana, Lourdes et al. (2014) Similarity-based modeling in large-scale prediction of drug-drug interactions. Nat Protoc 9:2147-63
Salmasian, Hojjat; Tran, Tran H; Friedman, Carol (2014) Developing a formal representation for medication appropriateness criteria. AMIA Annu Symp Proc 2014:1911-9
Vilar, Santiago; Uriarte, Eugenio; Santana, Lourdes et al. (2014) State of the art and development of a drug-drug interaction large scale predictor based on 3D pharmacophoric similarity. Curr Drug Metab 15:490-501
Salmasian, Hojjat; Freedberg, Daniel E; Friedman, Carol (2013) Deriving comorbidities from medical records using natural language processing. J Am Med Inform Assoc 20:e239-42
Liu, X Sherry; Wang, Ji; Zhou, Bin et al. (2013) Fast trabecular bone strength predictions of HR-pQCT and individual trabeculae segmentation-based plate and rod finite element model discriminate postmenopausal vertebral fractures. J Bone Miner Res 28:1666-78
Yadav, Kabir; Sarioglu, Efsun; Smith, Meaghan et al. (2013) Automated outcome classification of emergency department computed tomography imaging reports. Acad Emerg Med 20:848-54

Showing the most recent 10 out of 56 publications