A key prerequisite for high-quality healthcare delivery is effective communication within and across healthcare settings. However, communication can be hampered by the pervasive use of abbreviations in clinical notes. Clinicians use abbreviations to save time during documentation. While abbreviations may seem unambiguous to their authors, they often confuse other readers, including healthcare providers, patients, and natural language processing (NLP) systems attempting to extract clinical terms from text. Although it is widely understood that abbreviations can cause errors, few pragmatic solutions to this important problem have been deployed. The proposed project will develop, evaluate, and share a systematic approach to Clinical Abbreviation Recognition and Disambiguation (CARD). In doing so, it aims to substantially benefit existing NLP systems and to improve computer-based documentation systems by reducing ambiguities in electronic records in real time. The study includes the following five Specific Aims: 1) Develop automated methods to detect abbreviations and their senses from clinical text corpora and build a comprehensive knowledge base of clinical abbreviations; 2) Develop and evaluate three automated word sense disambiguation (WSD) classifiers, and establish methods to combine those classifiers to maximize both their performance and coverage; 3) Develop the CARD system, and demonstrate its effectiveness by integrating it with two established NLP systems (MedLEE and KnowledgeMap); 4) Integrate CARD with an institutional clinical documentation system (Vanderbilt's StarNotes) and evaluate its ability to expand abbreviations in real time as clinicians generate records; 5) Distribute the CARD knowledge base and software for non-commercial uses.

Public Health Relevance

Abbreviations are widely used in all types of clinical documents. They confuse both healthcare providers and patients and limit effective communication within and across care settings. This proposed study will develop informatics methods to automatically detect abbreviations and their possible meanings in large clinical text corpora and to disambiguate abbreviations that have multiple meanings. We will also integrate those methods with clinical documentation systems so that abbreviations are expanded in real time as physicians enter clinical notes, thereby improving the quality of health records.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Project (R01)
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Sim, Hua-Chuan
Vanderbilt University Medical Center
Internal Medicine/Medicine
Schools of Medicine
United States
Zhang, Yaoyun; Xu, Jun; Chen, Hui et al. (2016) Chemical named entity recognition in patents by domain knowledge and unsupervised feature learning. Database (Oxford) 2016:
Xu, Jun; Wu, Yonghui; Zhang, Yaoyun et al. (2016) CD-REST: a system for extracting chemical-induced disease relation in literature. Database (Oxford) 2016:
Zhang, Yaoyun; Soysal, Ergin; Moon, Sungrim et al. (2015) Integrating Multiple On-line Knowledge Bases for Disease-Lab Test Relation Extraction. AMIA Jt Summits Transl Sci Proc 2015:204-8
Wu, Yonghui; Xu, Jun; Jiang, Min et al. (2015) A Study of Neural Word Embeddings for Named Entity Recognition in Clinical Text. AMIA Annu Symp Proc 2015:1326-33
Wu, Yonghui; Jiang, Min; Lei, Jianbo et al. (2015) Named Entity Recognition in Chinese Clinical Text Using Deep Neural Network. Stud Health Technol Inform 216:624-8
Xu, Jun; Zhang, Yaoyun; Wu, Yonghui et al. (2015) Citation Sentiment Analysis in Clinical Trial Papers. AMIA Annu Symp Proc 2015:1334-41
Jiang, Min; Huang, Yang; Fan, Jung-wei et al. (2015) Parsing clinical text: how good are the state-of-the-art parsers? BMC Med Inform Decis Mak 15 Suppl 1:S2
Chen, Yukun; Lasko, Thomas A; Mei, Qiaozhu et al. (2015) A study of active learning methods for named entity recognition in clinical text. J Biomed Inform 58:11-8
Wu, Y; Denny, J C; Rosenbloom, S T et al. (2015) A Preliminary Study of Clinical Abbreviation Disambiguation in Real Time. Appl Clin Inform 6:364-74
Wu, Yonghui; Lei, Jianbo; Wei, Wei-Qi et al. (2013) Analyzing differences between Chinese and English clinical text: a cross-institution comparison of discharge summaries in two languages. Stud Health Technol Inform 192:662-6

Showing the most recent 10 out of 20 publications