Automatic discovery and processing of EEG cohorts from clinical records

Harabagiu, Sanda; Obeid, Iyad; Picone, Joseph

Abstract

Electronic medical records (EMRs) collected at every hospital in the country collectively contain a staggering wealth of biomedical knowledge. EMRs can include unstructured text, temporally constrained measurements (e.g., vital signs), multichannel signal data (e.g., EEGs), and image data (e.g., MRIs). This information could be transformative if properly harnessed. Information about patient medical problems, treatments, and clinical course is essential for conducting comparative effectiveness research. Uncovering clinical knowledge that enables comparative research is the primary goal of this proposal. We will focus on the automatic interpretation of clinical EEGs collected over 12 years at Temple University Hospital (over 25,000 sessions and 15,000 patients). Clinicians will be able to retrieve relevant EEG signals and EEG reports using standard queries (e.g., young patients with focal cerebral dysfunction who were treated with Topamax).
In Aim 1 we will automatically annotate EEG events that contribute to a diagnosis. We will develop automated techniques to discover and time-align the underlying EEG events using semi-supervised learning.
In Aim 2 we will process the text from the EEG reports using state-of-the-art clinical language processing techniques. Clinical concepts, their type, polarity and modality shall be discovered automatically, as well as spatial and temporal information. In addition, we shall extract the medical concepts describing the clinical picture of patients from the EEG reports.
In Aim 3, we will develop a patient cohort retrieval system that will operate on the clinical knowledge extracted in Aims 1 and 2. In addition, we shall organize this knowledge in a unified representation: the Qualified Medical Knowledge Graph (QMKG), which will be built using BigData solutions through MapReduce. The QMKG will be searchable by biomedical researchers as well as practicing clinicians. The QMKG will also provide a characterization of the way in which events in an EEG are narrated by physicians and the validation of these across a BigData resource. The QMKG represents an important contribution to basic science.
In Aim 4 we will validate the usefulness of the patient cohort identification system by collecting feedback from clinicians and medical students who will participate in a rigorous evaluation protocol. Inclusion and exclusion criteria for the queries shall be designed and experts will provide relevance judgments for the results. For each query, medical experts shall examine the top-ranked cohorts for common precision errors (false positives) and the bottom five ranked cohorts for common recall errors (false negatives). User validation testing will be performed using live clinical data and the feedback will enhance the quality of the cohort identification system. The existence of an annotated BigData archive of EEGs will greatly increase accessibility for non-experts in neuroscience, bioengineering and medical informatics who would like to study EEG data. The creation of this resource through the development of efficient automated data wrangling techniques will demonstrate that a much wider range of BigData bioengineering applications is now tractable.

Public Health Relevance

The primary goal of this proposal is to enable comparative research by automatically uncovering clinical knowledge from a vast BigData archive of clinical EEG signals and EEG reports collected over the past 12 years at Temple University Hospital. In the proposed project, we will develop a proof-of-concept based on the discovery of patient cohorts and provide an annotated BigData archive as well as the software that enabled the annotations and the generation of the patient cohort retrieval system. This resource will be accompanied by a novel medical knowledge representation generated with MapReduce, greatly increasing accessibility for non-experts in neuroscience, bioengineering and medical informatics, and demonstrating the transformative potential of mining the staggering wealth of biomedical knowledge available in hospital medical records.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Human Genome Research Institute (NHGRI)
Type: Research Project--Cooperative Agreements (U01)
Project #: 5U01HG008468-02
Application #: 9069903
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Sofia, Heidi J

Project Start: 2015-06-01
Project End: 2018-05-31
Budget Start: 2016-06-01
Budget End: 2017-05-31
Support Year: 2
Fiscal Year: 2016
Total Cost
Indirect Cost

Institution

Name: Temple University
Department: Engineering (All Types)
Type: Biomed Engr/Col Engr/Engr Sta
DUNS #: 057123192

City: Philadelphia
State: PA
Country: United States
Zip Code: 19122

Related projects


NIH 2017 U01 HG	Automatic discovery and processing of EEG cohorts from clinical records Picone, Joseph; Harabagiu, Sanda Maria; Obeid, Iyad / Temple University	$435,047
NIH 2016 U01 HG	Automatic discovery and processing of EEG cohorts from clinical records Harabagiu, Sanda Maria; Obeid, Iyad; Picone, Joseph / Temple University
NIH 2016 U01 HG	Scalable EEG interpretation using Deep Learning and Schema Descriptors Picone, Joseph; Harabagiu, Sanda Maria; Obeid, Iyad / Temple University	$374,745
NIH 2015 U01 HG	Automatic discovery and processing of EEG cohorts from clinical records Harabagiu, Sanda Maria; Obeid, Iyad; Picone, Joseph / Temple University

Publications

Maldonado, Ramon; Goodwin, Travis R; Harabagiu, Sanda M (2018) Memory-Augmented Active Deep Learning for Identifying Relations Between Distant Medical Concepts in Electroencephalography Reports. AMIA Jt Summits Transl Sci Proc 2017:156-165

Goodwin, Travis R; Skinner, Michael A; Harabagiu, Sanda M (2018) Automatically Linking Registered Clinical Trials to their Published Results with Deep Highway Networks. AMIA Jt Summits Transl Sci Proc 2017:54-63

Maldonado, Ramon; Goodwin, Travis R; Skinner, Michael A et al. (2017) Deep Learning Meets Biomedical Ontologies: Knowledge Embeddings for Epilepsy. AMIA Annu Symp Proc 2017:1233-1242

Goodwin, Travis R; Maldonado, Ramon; Harabagiu, Sanda M (2017) Automatic recognition of symptom severity from psychiatric evaluation records. J Biomed Inform 75S:S71-S84

Goodwin, Travis R; Harabagiu, Sanda M (2017) Inferring Clinical Correlations from EEG Reports with Deep Neural Learning. AMIA Annu Symp Proc 2017:770-779

Yang, S; López, S; Golmohammadi, M et al. (2016) SEMI-AUTOMATED ANNOTATION OF SIGNAL EVENTS IN CLINICAL EEG DATA. IEEE Signal Process Med Biol Symp 2016:

Goodwin, Travis R; Harabagiu, Sanda M (2016) Multi-modal Patient Cohort Identification from EEG Report and Signal Data. AMIA Annu Symp Proc 2016:1794-1803

Goodwin, Travis R; Harabagiu, Sanda M (2016) Medical Question Answering for Clinical Decision Support. Proc ACM Int Conf Inf Knowl Manag 2016:297-306

Goodwin, Travis; Harabagiu, Sanda M (2016) Inferring the Interactions of Risk Factors from EHRs. AMIA Jt Summits Transl Sci Proc 2016:78-87

López, S; Gross, A; Yang, S et al. (2016) AN ANALYSIS OF TWO COMMON REFERENCE POINTS FOR EEGS. IEEE Signal Process Med Biol Symp 2016:

Showing the most recent 10 out of 14 publications
