Challenges in Natural Language Processing for Clinical Narratives

Uzuner, Ozlem

Abstract

Narratives of electronic health records (EHRs) contain useful information that is difficult to automatically extract, index, search, or interpret. Clinical natural language processing (NLP) technologies for automatic extraction, indexing, searching, and interpretation of EHRs are in development; however, due to privacy concerns related to EHRs, such technologies are usually developed by teams that have privileged access to EHRs in a specific institution. Technologies that are tailored to a specific set of data from a given institution generate inspiring results on that data; however, they can fail to generalize to similar data from other institutions and even from other departments of the same institution. Therefore, learning from these technologies and building on them becomes difficult. In order to improve NLP in EHRs, there is a need for head-to-head comparison of approaches that address a given task on the same data set. Shared tasks provide one way of conducting systematic head-to-head comparisons. This proposal describes a series of shared-task challenges and conferences, spread over a five-year period, that promote the development and evaluation of cutting-edge clinical NLP systems by distributing de-identified EHRs to the broad research community, under data use agreements, so that:
* the state of the art in clinical NLP technologies can be identified and advanced,
* a set of technologies that enable the use of the information contained in EHR narratives becomes available, and
* the information from EHR narratives can be made more accessible, for example, for clinical and medical research.
The scientific activities supporting the organization of the shared-task challenges are sponsored in part by Informatics for Integrating Biology and the Bedside (i2b2), grant number U54-LM008748, PI: Kohane.
This proposal aims to organize a series of workshops, conference proceedings, and journal special issues that will accompany the shared-task challenges in order to disseminate the knowledge generated by the challenges.

Public Health Relevance

This proposal will address two main challenges related to the use of clinical narratives for research: the availability of clinical records for research, and the identification of the state of the art in clinical natural language processing (NLP) technologies, so that the state of the art can be pushed forward and future work can build on the past. Progress in clinical NLP will improve access to electronic health records for research and for clinical applications, benefiting healthcare and public health.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Conference (R13)
Project #: 1R13LM011411-01
Application #: 8400218
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan

Project Start: 2012-09-01
Project End: 2017-08-31
Budget Start: 2012-09-01
Budget End: 2013-08-31
Support Year: 1
Fiscal Year: 2012
Total Cost: $20,000
Indirect Cost

Institution

Name: State University of New York at Albany
Department
Type: Schools of Arts and Sciences
DUNS #: 152652822

City: Albany
State: NY
Country: United States
Zip Code: 12222

Related projects


NIH 2016 R13 LM	Challenges in Natural Language Processing for Clinical Narratives Uzuner, Ozlem / State University of New York at Albany	$20,000
NIH 2015 R13 LM	Challenges in Natural Language Processing for Clinical Narratives Uzuner, Ozlem / State University of New York at Albany
NIH 2014 R13 LM	Challenges in Natural Language Processing for Clinical Narratives Uzuner, Ozlem / State University of New York at Albany
NIH 2013 R13 LM	Challenges in Natural Language Processing for Clinical Narratives Uzuner, Ozlem / State University of New York at Albany	$18,400
NIH 2012 R13 LM	Challenges in Natural Language Processing for Clinical Narratives Uzuner, Ozlem / State University of New York at Albany	$20,000

Publications

Singh, Vivek Kumar; Shrivastava, Utkarsh; Bouayad, Lina et al. (2018) Machine learning for psychiatric patient triaging: an investigation of cascading classifiers. J Am Med Inform Assoc 25:1481-1487

Karystianis, George; Nevado, Alejo J; Kim, Chi-Hun et al. (2018) Automatic mining of symptom severity from psychiatric evaluation notes. Int J Methods Psychiatr Res 27:

Jiang, Zhipeng; Zhao, Chao; He, Bin et al. (2017) De-identification of medical records using conditional random fields and long short-term memory networks. J Biomed Inform 75S:S43-S53

Scheurwegs, Elyne; Sushil, Madhumita; Tulkens, Stéphan et al. (2017) Counting trees in Random Forests: Predicting symptom severity in psychiatric intake reports. J Biomed Inform 75S:S112-S119

Bui, Duy Duc An; Wyatt, Mathew; Cimino, James J (2017) The UAB Informatics Institute and 2016 CEGS N-GRID de-identification shared task challenge. J Biomed Inform 75S:S54-S61

Goodwin, Travis R; Maldonado, Ramon; Harabagiu, Sanda M (2017) Automatic recognition of symptom severity from psychiatric evaluation records. J Biomed Inform 75S:S71-S84

Liu, Zengjian; Tang, Buzhou; Wang, Xiaolong et al. (2017) De-identification of clinical notes via recurrent neural network and conditional random field. J Biomed Inform 75S:S34-S42

Uzuner, Özlem; Stubbs, Amber; Filannino, Michele (2017) A natural language processing challenge for clinical records: Research Domains Criteria (RDoC) for psychiatry. J Biomed Inform 75S:S1-S3

Dehghan, Azad; Kovacevic, Aleksandar; Karystianis, George et al. (2017) Learning to identify Protected Health Information by integrating knowledge- and data-driven algorithms: A case study on psychiatric evaluation notes. J Biomed Inform 75S:S28-S33

Rios, Anthony; Kavuluru, Ramakanth (2017) Ordinal convolutional neural networks for predicting RDoC positive valence psychiatric symptom severity scores. J Biomed Inform 75S:S85-S93

Showing the most recent 10 out of 77 publications
