Natural language processing for clinical and translational research

Liu, Hongfang; Pakhomov, Serguei; Xu, Hua

Abstract

Rapid growth in the clinical implementation of large electronic medical records (EMRs) has led to an unprecedented expansion in the availability of dense longitudinal datasets for clinical and translational research. This growth is being fueled by recent federal legislation that provides generous financial incentives to institutions demonstrating aggressive application and meaningful use of comprehensive EMRs. Efforts are already underway to link these EMRs across institutions and standardize the definition of phenotypes for large-scale studies of disease onset and treatment outcome, specifically within the context of routine clinical care. However, a well-known challenge for secondary use of EMR data for clinical and translational research is that much of the detailed patient information is embedded in narrative text. Natural Language Processing (NLP) technologies, which are able to convert unstructured clinical text into coded data, have been introduced into the biomedical domain and have demonstrated promising results. Researchers have used NLP systems to identify clinical syndromes and common biomedical concepts from radiology reports, discharge summaries, problem lists, nursing documentation, and medical education documents. Different NLP systems have been developed at different institutions and used to convert clinical narrative text into structured data that may be used for other clinical applications and studies. Success stories in applying NLP to clinical and translational research have been widely reported. However, institutions often deploy different NLP systems, which produce differing output formats and make it difficult to exchange information between sites. Therefore, the lack of interoperability among different clinical NLP systems is a bottleneck for efficient multi-site studies.
In addition, many successful studies require a strong interdisciplinary team in which informaticians and clinicians must work closely to iteratively define optimal algorithms for clinical phenotypes. As intensive informatics support may not be available to every clinical researcher, the usability of NLP systems for end users is another important issue. The proposed project builds upon first-hand knowledge and experience across the research team in the use of NLP for clinical and translational research projects. There are several large informatics initiatives for clinical and translational research, but those initiatives generally assume that one size fits all and follow top-down approaches to develop NLP solutions. Complementary to those initiatives, we will use a bottom-up approach to handle interoperability and usability: (i) we will obtain a common NLP data model and exchange format through empirical analysis of existing NLP systems and NLP results; (ii) we will develop a user-centric front-end interface for NLP systems wrapped to be consistent with the proposed NLP data model and exchange format, incorporating usability analysis into the agile development process. All deliverables will be distributed through the Open Health NLP (OHNLP) Consortium, which we intend to make more open and inclusive.

Public Health Relevance

Rapid growth in the clinical implementation of large electronic medical records (EMRs) has led to an unprecedented expansion in the availability of dense longitudinal datasets for clinical and translational research. We propose the development of a novel framework to enable the use of clinical information embedded in clinical narratives for clinical and translational research.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Institute of General Medical Sciences (NIGMS)
Type: Research Project (R01)
Project #: 4R01GM102282-04
Application #: 9033918
Study Section: Biomedical Computing and Health Informatics Study Section (BCHI)
Program Officer: Marcus, Stephen

Project Start: 2013-04-01
Project End: 2017-03-31
Budget Start: 2016-04-01
Budget End: 2017-03-31
Support Year: 4
Fiscal Year: 2016

Institution

Name: Mayo Clinic, Rochester
DUNS #: 006471700

City: Rochester
State: MN
Country: United States
Zip Code: 55905

Related projects


NIH 2016 R01 GM	Natural language processing for clinical and translational research Liu, Hongfang; Pakhomov, Serguei V S.; Xu, Hua / Mayo Clinic, Rochester
NIH 2015 R01 GM	Natural language processing for clinical and translational research Liu, Hongfang; Pakhomov, Serguei V S.; Xu, Hua / Mayo Clinic, Rochester
NIH 2014 R01 GM	Natural language processing for clinical and translational research Liu, Hongfang; Pakhomov, Serguei V S.; Xu, Hua / Mayo Clinic, Rochester	$580,082
NIH 2014 R01 GM	Natural language processing for clinical and translational research Liu, Hongfang; Pakhomov, Serguei V S.; Xu, Hua / Mayo Clinic, Rochester	$160,000
NIH 2013 R01 GM	Natural language processing for clinical and translational research Liu, Hongfang; Pakhomov, Serguei V S.; Xu, Hua / Mayo Clinic, Rochester	$630,706

Publications

Wang, Yanshan; Liu, Sijia; Afzal, Naveed et al. (2018) A comparison of word embeddings for the biomedical natural language processing. J Biomed Inform 87:12-20

Afzal, Naveed; Mallipeddi, Vishnu Priya; Sohn, Sunghwan et al. (2018) Natural language processing of clinical notes for identification of critical limb ischemia. Int J Med Inform 111:83-89

Chaudhry, Alisha P; Afzal, Naveed; Abidian, Mohamed M et al. (2018) Innovative Informatics Approaches for Peripheral Artery Disease: Current State and Provider Survey of Strategies for Improving Guideline-Based Care. Mayo Clin Proc Innov Qual Outcomes 2:129-136

Wang, Liwei; Rastegar-Mojarad, Majid; Ji, Zhiliang et al. (2018) Detecting Pharmacovigilance Signals Combining Electronic Medical Records With Spontaneous Reports: A Case Study of Conventional Disease-Modifying Antirheumatic Drugs for Rheumatoid Arthritis. Front Pharmacol 9:875

Hultman, Gretchen; McEwan, Reed; Pakhomov, Serguei et al. (2018) Usability Evaluation of an Unstructured Clinical Document Query Tool for Researchers. AMIA Jt Summits Transl Sci Proc 2017:84-93

Lee, Hee-Jin; Wu, Yonghui; Zhang, Yaoyun et al. (2017) A hybrid approach to automatic de-identification of psychiatric notes. J Biomed Inform 75S:S19-S27

Shen, Feichen; Liu, Sijia; Wang, Yanshan et al. (2017) Leveraging Collaborative Filtering to Accelerate Rare Disease Diagnosis. AMIA Annu Symp Proc 2017:1554-1563

Sohn, Sunghwan; Wi, Chung-Il; Juhn, Young J et al. (2017) Analysis of Clinical Variations in Asthma Care Documented in Electronic Health Records Between Staff and Resident Physicians. Stud Health Technol Inform 245:1170-1174

Zhang, Yaoyun; Zhang, Olivia; Wu, Yonghui et al. (2017) Psychiatric symptom recognition without labeled data using distributional representations of phrases and on-line knowledge. J Biomed Inform 75S:S129-S137

Moon, Sungrim; Ihrke, Donna; Zeng, Yuqun et al. (2017) Distinction between medical and non-medical usages of short forms in clinical narratives. AMIA Annu Symp Proc 2017:1302-1311

Showing the most recent 10 out of 96 publications
