Natural Language Question Understanding for Electronic Health Records

Roberts, Kirk

Abstract

Patient information in the electronic health record (EHR) such as lab results, medications, and past medical history is the basis for physician decisions about patient care. It also helps patients better understand and manage their care. Efficient access to this patient information is thus essential. One of the most intuitive ways of accessing data is by asking natural language questions. A significant amount of work in medical question answering has been conducted, yet little work has been performed in question answering for EHRs. Natural language questions can be represented in logical forms, a standard structured knowledge representation technique. This project proposes to take natural language EHR questions, both for doctors and patients, and automatically convert them to a logical form. The logical forms can then be converted to a structured query such as those used by EHRs. A major obstacle to this approach is the lack of data containing questions annotated with logical forms. This project hypothesizes that a small set of questions can be manually annotated, and then paraphrases can be produced for each annotated question. Since paraphrasing is a simpler task than logical form annotation, crowd-sourcing techniques can be used to collect thousands of question paraphrases. This question paraphrase corpus will then be used to build a semantic grammar capable of recognizing the logical structure of EHR questions. To ensure a robust, generalizable grammar, existing NLP techniques will be used to pre-process questions, simplifying their syntactic structure and abstracting their medical concepts.
In order to develop such a method, the candidate, Dr. Kirk Roberts, requires additional training and mentoring in natural language processing and biomedical informatics. This application for the NIH Pathway to Independence Award (K99/R00) describes a career development plan that will allow Dr. Roberts to achieve the goals of this project as well as transition to a career as an independent researcher. He will be mentored by Dr. Dina Demner-Fushman, a leading medical NLP researcher, and co-mentored by Dr. Clement McDonald, a leading EHR and medical informatics researcher.
The specific aims of the project are: (1) Build a paraphrase collection of EHR questions, where each prototype question will have many unique paraphrases. The paraphrases encompass different lexical and syntactic means of conveying the same logical form. (2) Construct a semantic grammar for EHR questions. The grammar can then be used to convert a natural language question to a logical form. (3) Implement an end-to-end question analyzer that generalizes EHR questions for improved parsing, parses the question into a logical form using the grammar, and converts the logical form into a leading structured EHR query format.
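The three stages named in aim (3) can be illustrated with a minimal sketch. It is not the project's grammar or annotation scheme: the concept lexicon, the placeholder codes, the two grammar rules, the lambda-style logical-form notation, and the FHIR-style search string used as the target query format are all assumptions chosen for illustration.

```python
import re

# Hypothetical concept lexicon: surface phrase -> (semantic type, code).
# The codes are illustrative placeholders, not real terminology codes.
CONCEPTS = {
    "hemoglobin a1c": ("lab", "CODE_A1C"),
    "potassium": ("lab", "CODE_K"),
    "metformin": ("medication", "CODE_METFORMIN"),
}

# Toy semantic grammar (assumed): each rule pairs a pattern over the
# generalized question with a logical-form template.
GRAMMAR = [
    (re.compile(r"^what (?:is|was) the (?:latest|last|most recent) lab_concept$"),
     "latest(lambda x. has_lab(x, {code}))"),
    (re.compile(r"^(?:is|was) (?:the patient|she|he) (?:on|taking) medication_concept$"),
     "exists(lambda x. has_medication(x, {code}))"),
]

# Shape of the logical forms produced by the templates above.
LF_PATTERN = re.compile(
    r"^(latest|exists)\(lambda x\. has_(lab|medication)\(x, (\w+)\)\)$"
)


def generalize(question):
    """Stage 1: normalize the text and abstract a known medical concept.

    Returns the generalized question and the concept code (or None).
    """
    text = re.sub(r"[^\w\s]", "", question.lower())
    text = re.sub(r"\s+", " ", text).strip()
    for phrase, (sem_type, code) in CONCEPTS.items():
        if phrase in text:
            return text.replace(phrase, f"{sem_type}_concept"), code
    return text, None


def parse(question):
    """Stage 2: map a question to a logical form, or None if no rule matches."""
    generalized, code = generalize(question)
    if code is None:
        return None
    for pattern, template in GRAMMAR:
        if pattern.match(generalized):
            return template.format(code=code)
    return None


def to_query(logical_form):
    """Stage 3: convert a logical form to a FHIR-style search string (assumed target)."""
    match = LF_PATTERN.match(logical_form)
    if match is None:
        raise ValueError(f"unsupported logical form: {logical_form}")
    operator, sem_type, code = match.groups()
    resource = "Observation" if sem_type == "lab" else "MedicationRequest"
    query = f"{resource}?code={code}"
    if operator == "latest":
        # Newest result first, keep only one.
        query += "&_sort=-date&_count=1"
    return query


if __name__ == "__main__":
    for q in ["What was the most recent hemoglobin A1c?",
              "What is the latest hemoglobin A1c",
              "Is the patient taking metformin?"]:
        lf = parse(q)
        print(q, "|", lf, "|", to_query(lf))
```

In this sketch the first two questions are paraphrases and resolve to the same logical form, which is the property the paraphrase collection in aim (1) is meant to capture at scale. A real grammar would be learned from that corpus instead of being written by hand.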

Public Health Relevance

The proposed work aims to significantly improve the ability of both doctors and patients to find information within electronic health records (EHR). By providing an interface to EHRs where users can specify their information needs in the form of a natural language question, the proposed work provides a more intuitive means of finding patient data than is currently available.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Transition Award (R00)
Project #: 5R00LM012104-04
Application #: 9479293
Study Section: Special Emphasis Panel (NSS)
Program Officer: Vanbiervliet, Alan

Project Start: 2016-05-01
Project End: 2019-04-30
Budget Start: 2018-05-01
Budget End: 2019-04-30
Support Year: 4
Fiscal Year: 2018
Total Cost
Indirect Cost

Institution

Name: University of Texas Health Science Center Houston
Department
Type: Sch Allied Health Professions
DUNS #: 800771594

City: Houston
State: TX
Country: United States
Zip Code: 77030

Related projects


NIH 2018 R00 LM	Natural Language Question Understanding for Electronic Health Records Roberts, Kirk Edward / University of Texas Health Science Center Houston
NIH 2017 R00 LM	Natural Language Question Understanding for Electronic Health Records Roberts, Kirk Edward / University of Texas Health Science Center Houston
NIH 2016 R00 LM	Natural Language Question Understanding for Electronic Health Records Roberts, Kirk Edward / University of Texas Health Science Center Houston

Publications

Demner-Fushman, Dina; Shooshan, Sonya E; Rodriguez, Laritza et al. (2018) A dataset of 200 structured product labels annotated for adverse drug reactions. Sci Data 5:180001

Zhang, Yaoyun; Li, Hee-Jin; Wang, Jingqi et al. (2018) Adapting Word Embeddings from Multiple Domains to Symptom Recognition from Psychiatric Notes. AMIA Jt Summits Transl Sci Proc 2017:281-289

Zhang, Yaoyun; Zhang, Olivia; Wu, Yonghui et al. (2017) Psychiatric symptom recognition without labeled data using distributional representations of phrases and on-line knowledge. J Biomed Inform 75S:S129-S137

Lee, Hee-Jin; Zhang, Yaoyun; Roberts, Kirk et al. (2017) Leveraging existing corpora for de-identification of psychiatric notes using domain adaptation. AMIA Annu Symp Proc 2017:1070-1079

Lee, Hee-Jin; Wu, Yonghui; Zhang, Yaoyun et al. (2017) A hybrid approach to automatic de-identification of psychiatric notes. J Biomed Inform 75S:S19-S27

Mrabet, Yassine; Kilicoglu, Halil; Roberts, Kirk et al. (2016) Combining Open-domain and Biomedical Knowledge for Topic Recognition in Consumer Health Questions. AMIA Annu Symp Proc 2016:914-923

Roberts, Kirk; Demner-Fushman, Dina (2016) Annotating Logical Forms for EHR Questions. LREC Int Conf Lang Resour Eval 2016:3772-3778

Roberts, Kirk; Rodriguez, Laritza; Shooshan, Sonya E et al. (2016) Resource Classification for Medical Questions. AMIA Annu Symp Proc 2016:1040-1049

Roberts, Kirk; Demner-Fushman, Dina (2016) Interactive use of online health resources: a comparison of consumer and professional questions. J Am Med Inform Assoc 23:802-11
