Patient Medical History Representation, Extraction, and Inference from EHR Data

Tao, Cui

Abstract

The significance of developing tools for automatically harvesting temporal constraints of clinical events from Electronic Health Records (EHR) cannot be overestimated. Efficient analysis of the temporal aspects of EHR data could boost an array of clinical and translational research, such as disease progression studies, decision support systems, and personalized medicine. One major challenge is to automatically untangle and linearize the temporal constraints of clinical events embedded in highly diverse, large-scale EHR data. Barriers to temporal data modeling, normalization, extraction, and reasoning have precluded the efficient use of EHR data sources for event history evaluation and trending analysis: (1) current federally supported EHR data normalization tools do not yet focus on the time aspect of unstructured data; (2) existing time models focus only on structured data with absolute time, lack supporting reasoning systems, or offer only application-specific partial solutions that cannot be adopted for complex EHR data; (3) current temporal information extraction approaches are difficult to adapt to EHR data, are not scalable, or offer only application-specific partial solutions. The proposed project fills the current gaps among ontologies, Natural Language Processing (NLP), and EHR-based clinical research for temporal data representation, normalization, extraction, and reasoning. We propose to develop novel approaches for automatic temporal data representation, normalization, and reasoning for large, diverse, and heterogeneous EHR data, and to prepare the integrated data for further analysis. We will build new reasoning and extraction capacities on our TIMER (Temporal Information Modeling, Extracting, and Reasoning) framework to provide an end-to-end, open-source, standard-conforming software package. TIMER will be built on strong prior work by our team.
We will develop new features in our CNTRO (Clinical Narrative Temporal Relation Ontology) for semantically defining the time domain and representing temporal data in complex EHR data. On top of the newly developed CNTRO semantics, we will implement temporal relation reasoning capacities to automatically normalize temporal expressions, compute and infer temporal relations, and resolve ambiguities. We will build on existing NLP tools to develop new extraction approaches that fill the current gaps between NLP approaches and ontology-based reasoning approaches. We will adapt the SHARPn EHR data normalization pipeline and cTAKES for extracting and normalizing clinical event mentions from clinical narratives. We will explore an innovative approach for temporal relation extraction and event coreference, and make it work with the TIMER framework. We will evaluate the system using Diabetes Mellitus (DM) and colorectal cancer (CRC) patient cohorts from two institutions. Each component will first be tested separately, followed by an evaluation of the whole framework. Results such as precision, recall, and F-measure will be reported.
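
As an illustration of the evaluation metrics named above, the following is a minimal sketch of how precision, recall, and F-measure might be computed for extracted temporal relations. It is not part of TIMER or CNTRO. The triple format, the function name precision_recall_f1, and the example events are all hypothetical and chosen only for illustration.

```python
from typing import Set, Tuple

# Hypothetical representation: (event_a, relation, event_b).
Relation = Tuple[str, str, str]


def precision_recall_f1(
    gold: Set[Relation], predicted: Set[Relation]
) -> Tuple[float, float, float]:
    """Score predicted temporal relations against a gold-standard set."""
    true_positives = len(gold & predicted)
    # Guard against empty sets to avoid division by zero.
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    denom = precision + recall
    # F-measure here is the balanced F1 (harmonic mean of precision and recall).
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Invented example data, not drawn from any patient cohort.
gold = {
    ("diagnosis", "before", "surgery"),
    ("surgery", "before", "chemotherapy"),
    ("diagnosis", "before", "chemotherapy"),
}
predicted = {
    ("diagnosis", "before", "surgery"),
    ("surgery", "before", "chemotherapy"),
    ("chemotherapy", "before", "follow_up"),
}

precision, recall, f1 = precision_recall_f1(gold, predicted)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this sketch, two of the three predicted relations match the gold standard and two of the three gold relations are recovered, so precision, recall, and F-measure all equal 2/3. Exact-match scoring over triples is an assumption here; the project may score inferred or equivalent relations differently.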

Public Health Relevance

The significance of developing capabilities for automatically harvesting temporal constraints for clinical events from Electronic Health Records (EHR) cannot be overestimated. A substantial portion of the information in the EHR is historical in nature. Patient medical history can be long, especially for complex patients. The proposed work offers an end-to-end, open-source framework for automatically extracting, normalizing, and reasoning over clinically important, time-relevant information from large-scale EHR data. It can boost an array of clinical and translational research, such as disease progression studies, decision support systems, and personalized medicine, and can facilitate clinical practice for early disease detection, post-treatment care, and patient-clinician communication.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM011829-04
Application #: 9332464
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan

Project Start: 2014-09-01
Project End: 2019-08-31
Budget Start: 2017-09-01
Budget End: 2018-08-31
Support Year: 4
Fiscal Year: 2017

Institution

Name: University of Texas Health Science Center Houston
Department
Type: Sch Allied Health Professions
DUNS #: 800771594

City: Houston
State: TX
Country: United States
Zip Code: 77030

Related projects


NIH 2018 R01 LM	Patient Medical History Representation, Extraction, and Inference from EHR Data Tao, Cui / University of Texas Health Science Center Houston
NIH 2017 R01 LM	Patient Medical History Representation, Extraction, and Inference from EHR Data Tao, Cui / University of Texas Health Science Center Houston
NIH 2016 R01 LM	Patient Medical History Representation, Extraction, and Inference from EHR Data Tao, Cui / University of Texas Health Science Center Houston
NIH 2015 R01 LM	Patient Medical History Representation, Extraction, and Inference from EHR Data Tao, Cui / University of Texas Health Science Center Houston
NIH 2014 R01 LM	Patient Medical History Representation, Extraction, and Inference from EHR Data Tao, Cui / University of Texas Health Science Center Houston

Publications

Amith, Muhammad; Tao, Cui (2018) Representing vaccine misinformation using ontologies. J Biomed Semantics 9:22

Huang, Jing; Du, Jingcheng; Duan, Rui et al. (2018) Characterization of the Differential Adverse Event Rates by Race/Ethnicity Groups for HPV Vaccine by Integrating Data From Different Sources. Front Pharmacol 9:539

Wang, Yanshan; Liu, Sijia; Afzal, Naveed et al. (2018) A comparison of word embeddings for the biomedical natural language processing. J Biomed Inform 87:12-20

Brusco, Lauren L; Wathoo, Chetna; Mills Shaw, Kenna R et al. (2018) Physician interpretation of genomic test results and treatment selection. Cancer 124:966-972

Wang, Liwei; Rastegar-Mojarad, Majid; Ji, Zhiliang et al. (2018) Detecting Pharmacovigilance Signals Combining Electronic Medical Records With Spontaneous Reports: A Case Study of Conventional Disease-Modifying Antirheumatic Drugs for Rheumatoid Arthritis. Front Pharmacol 9:875

Amith, Muhammad; He, Zhe; Bian, Jiang et al. (2018) Assessing the practice of biomedical ontology evaluation: Gaps and opportunities. J Biomed Inform 80:1-13

Amith, Muhammad; Cunningham, Rachel; Savas, Lara S et al. (2017) Using Pathfinder networks to discover alignment between expert and consumer conceptual knowledge from online vaccine content. J Biomed Inform 74:33-45

Liu, Sijia; Wang, Liwei; Ihrke, Donna et al. (2017) Correlating Lab Test Results in Clinical Notes with Structured Lab Data: A Case Study in HbA1c and Glucose. AMIA Jt Summits Transl Sci Proc 2017:221-228

Ravikumar, K E; Rastegar-Mojarad, Majid; Liu, Hongfang (2017) BELMiner: adapting a rule-based relation extraction system to extract biological expression language statements from bio-medical literature evidence sentences. Database (Oxford) 2017:

Du, Jingcheng; Cai, Yi; Chen, Yong et al. (2017) Analysis of Individual Differences in Vaccine Pharmacovigilance Using VAERS Data and MedDRA System Organ Classes: A Use Case Study With Trivalent Influenza Vaccine. Biomed Inform Insights 9:1178222617700627

Showing the most recent 10 out of 34 publications
