Temporal Processing for Medical Discharge Summaries

Pustejovsky, James

Abstract

The goals of our project are as follows:

1. Create a corpus of temporally annotated data. Under the supervision of our consultants Dr. Frank Sacks, Dr. Vincent Carey, and two Registered Nurses, we will create a gold-standard annotation of events and temporal information within patient narratives from de-identified Electronic Health Record (EHR) data using the CLEF and TimeML guidelines. We will use the framework of the Brandeis Annotation Tool, a system we have designed to facilitate the quick construction of accurately annotated corpora against a specified guideline. Extensions to the current event library and lexicon with medical event references will be made during the annotation process, under the guidance of the Registered Nurses.

2. Adapt the TARSQI Toolkit (TTK) to targeted temporal properties and relations in the EHR domain. We will use the TARSQI Toolkit, a robust set of temporal processing algorithms we have designed for parsing natural language text, to automatically annotate the events and temporal information in EHR data. Combined with the Brandeis AcroMed Medical Abbreviation Server and the terms introduced in Aim 1, we will employ the Specialist Lexicon and other medical resources to extend the toolkit's capabilities for recognizing and interpreting medical event information. Algorithms for identifying events, temporal expressions, and event anchorings and orderings will be trained against the gold standard created in Aim 1, and tested against held-out data.

3. Create a cross-document temporal database of medical events. Using the recognition algorithms introduced in Aim 2, we will create a searchable, temporally ordered database of medical events such as diseases, symptoms, surgeries/interventions, and test results. Events referred to multiple times in the data will be merged using a constraint-satisfaction analysis in order to create a more coherent narrative for a single patient over multiple records.
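
To make the merging and ordering step in Aim 3 more concrete, the following is a minimal illustrative sketch, not the project's actual method or any part of the TARSQI Toolkit. It assumes that event mentions have already been extracted, that some upstream step has judged which mentions refer to the same event, and that temporal relations are reduced to simple "A before B" pairs. The function name `merge_and_order`, the mention identifiers, and the use of union-find plus a topological sort as a stand-in for a fuller constraint-satisfaction analysis are all assumptions made for illustration.

```python
from graphlib import TopologicalSorter, CycleError


def merge_and_order(mentions, same_event, before):
    """Merge coreferent event mentions, then order the merged events.

    mentions:   list of mention ids (hypothetical, e.g. "mi#1")
    same_event: pairs of mentions assumed to denote the same event
    before:     pairs (a, b) meaning "a happened before b"
    Raises ValueError if the constraints cannot all be satisfied.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Follow parent links to the representative, compressing the path.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    # Union coreferent mentions; the smaller id becomes the representative.
    for a, b in same_event:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Build a predecessor graph over merged events.
    graph = {find(m): set() for m in mentions}
    for a, b in before:
        ra, rb = find(a), find(b)
        if ra == rb:
            raise ValueError(f"{a} and {b} are merged but ordered")
        graph[rb].add(ra)

    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError("inconsistent temporal constraints") from exc


# Hypothetical mentions drawn from two records of one patient.
timeline = merge_and_order(
    mentions=["admission#1", "mi#1", "mi#2", "cabg#1", "discharge#1"],
    same_event=[("mi#1", "mi#2")],
    before=[
        ("mi#1", "admission#1"),
        ("admission#1", "cabg#1"),
        ("mi#2", "cabg#1"),
        ("cabg#1", "discharge#1"),
    ],
)
print(timeline)
```

In this sketch a cycle among the "before" relations is reported as an inconsistency rather than repaired; a real constraint-satisfaction analysis over richer TimeML relations (such as overlap or inclusion) would need a more expressive representation.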

Public Health Relevance

It is becoming increasingly common for medical researchers to use Electronic Health Records (EHRs) as a primary source of data for researching correlations between various medical issues and concepts. However, EHRs typically contain unstructured text, making them difficult to mine. This research will create a database of temporal orderings from events extracted from EHR patient narratives, using algorithms previously applied to news articles.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Exploratory/Developmental Grants (R21)
Project #: 5R21LM009633-02
Application #: 7941063
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan

Project Start: 2009-09-30
Project End: 2012-09-29
Budget Start: 2010-09-30
Budget End: 2012-09-29
Support Year: 2
Fiscal Year: 2010
Total Cost: $175,973
Indirect Cost

Institution

Name: Brandeis University
Department: Biostatistics & Other Math Sci
Type: Schools of Arts and Sciences
DUNS #: 616845814

City: Waltham
State: MA
Country: United States
Zip Code: 02454

Related projects


NIH 2010 R21 LM	Temporal Processing for Medical Discharge Summaries	Pustejovsky, James / Brandeis University	$175,973
NIH 2009 R21 LM	Temporal Processing for Medical Discharge Summaries	Pustejovsky, James / Brandeis University	$177,750
