Primary care physicians (PCPs) are responsible for reviewing and understanding a wide spectrum of a patient's medical history in order to make informed decisions regarding care. However, a variety of factors impede this process, including the increasing complexity and number of diagnostic tests and treatments, health information exchange standards that may add more information to the medical record, and the need to see more patients in less time. These obstacles can inhibit dialogue between patients and providers and may even contribute to medical errors. New methods are required that summarize key information and thereby expedite a healthcare provider's understanding of a patient's medical history. The use of topic models for summarizing large, unstructured data collections is a growing area of research. However, to date little work has been done on adapting these models to the clinical reporting environment. This proposal seeks to develop a topic model and an accompanying visualization system for automatically summarizing medical records to support PCPs.
Two specific aims guide the proposed work: 1) to create a topic model of free-text clinical documents that integrates contextual patient- and document-level data and discovers multi-word concepts; and 2) to use the proposed model to drive a web application that includes concept-, source-, and time-oriented views for automatically summarizing patient records. The proposed model's innovation is its adaptation to clinical records: it incorporates demographic and discrete data (e.g., lab results), which influence the discovery of topics in documents and allow the model to adapt to each patient's specific history. As a test bed for this project, we will gather medical records coded with myocardial infarction (MI), breast cancer, or liver cirrhosis, as these patients will span a spectrum of clinical complexity. We estimate that 68,539 patient records will be included in this study. The developed topic model will be integrated into a web-based visualization that displays clinically pertinent topics over time, as well as other relevant clinical data. This visualization will be evaluated by PCPs to gauge its utility in supporting the review of medical histories. This R21 proposal breaks new ground in the use of topic models for clinical data and will open avenues for future research into new applications of the proposed model.

Public Health Relevance

Primary care physicians are critical to the task of patient care. Underpinning this task is the time-intensive process of understanding complex interactions between past medical conditions and treatments, and current problems for each patient. The focus of this research proposal is the development of an automatic summarization system to expedite the review of a patient's medical history. Through future studies, such a system may enable increased patient-provider dialogue and improved clinical workflow.

Agency
National Institutes of Health (NIH)
Institute
National Library of Medicine (NLM)
Type
Exploratory/Developmental Grants (R21)
Project #
5R21LM011937-02
Application #
8919947
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Sim, Hua-Chuan
Project Start
2014-09-01
Project End
2017-08-31
Budget Start
2015-09-01
Budget End
2017-08-31
Support Year
2
Fiscal Year
2015
Total Cost
Indirect Cost
Name
University of California Los Angeles
Department
Radiation-Diagnostic/Oncology
Type
Schools of Medicine
DUNS #
092530369
City
Los Angeles
State
CA
Country
United States
Zip Code
90095
Speier, William; Ong, Michael K; Arnold, Corey W (2016) Using phrases and document metadata to improve topic modeling of clinical reports. J Biomed Inform 61:260-6
Arnold, Corey W; Oh, Andrea; Chen, Shawn et al. (2016) Evaluating topic model interpretability from a primary care physician perspective. Comput Methods Programs Biomed 124:67-75