Our goal is to use an information fusion approach to integrate structured and unstructured information into a longitudinal health record (LHR) that accelerates the pace at which patients can be recruited into clinical trials. Because electronic health records (EHRs) contain summaries of a patient's clinical history, one would assume that they could be easily leveraged to automatically screen and identify potentially eligible patients. However, most EHRs are not well designed to support screening of eligible patients; they are composed of multiple data sources that are often redundant or inconsistent, stored as uncoordinated unstructured clinical narratives and structured data. These characteristics make EHRs difficult to use for matching patients against the complex event and temporal criteria of clinical trial protocols. This research proposes that an improved LHR, which contains a comprehensive clinical summary of a patient, can improve patient screening. We propose generating this LHR through information fusion, which merges information from multiple data sources while addressing both the meaning and the temporal nature of the data, such that the resulting information is more accurate than would be possible if the sources were used individually.
The specific aims are to: 1) characterize the barriers to using EHR sources for screening in terms of data redundancy, inconsistency, lack of structure, and temporal imprecision; 2) automatically extract, using natural language processing, the information from unstructured EHR sources that is necessary for screening patients against clinical trial eligibility criteria; 3) develop an LHR appropriate for screening patients against eligibility criteria using information fusion methods based on semantic and temporal information; and 4) evaluate the accuracy of an LHR formed through information fusion for screening patients against clinical trial eligibility criteria. The respective hypotheses to be tested are: 1) Different parts of the EHR will contain variable amounts of redundancy, inconsistency, and temporal imprecision, and some sources will be more valuable than others for matching patients to clinical trial eligibility criteria. 2) Including the information contained in the unstructured notes will reduce the false positive rate of identifying potentially eligible patients compared with using only the structured data in the EHR. 3) By applying information fusion methods that use semantic and temporal information to a combination of structured and unstructured data, we will be able to accurately summarize the information contained in uncoordinated EHR data sources into an LHR that can be used for screening patients for clinical trials. 4) The use of information fusion to generate a longitudinal health record will increase the sensitivity and specificity of electronic clinical trial screening over using a traditional EHR. With an LHR formed through information fusion for screening patients for clinical trial eligibility, we will be able not only to reduce the amount of staff effort required to recruit a patient into a clinical trial, but also to accelerate the pace at which clinical trials can be conducted.
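Hypotheses 2 and 4 are stated in terms of false positive rate, sensitivity, and specificity. As a minimal sketch of how those quantities could be computed for a screening run, the Python below compares screening decisions against chart-reviewed eligibility. The function name `screening_metrics` and the example labels are hypothetical, chosen only for illustration; they are not the project's evaluation code or data.

```python
import math
from typing import Iterable, Tuple


def screening_metrics(
    predicted: Iterable[bool], actual: Iterable[bool]
) -> Tuple[float, float, float]:
    """Return (sensitivity, specificity, false_positive_rate).

    predicted: screening decision per patient (True = flagged as eligible)
    actual:    reference standard per patient (True = truly eligible)
    """
    tp = fp = tn = fn = 0
    for flagged, eligible in zip(predicted, actual):
        if flagged and eligible:
            tp += 1  # correctly flagged
        elif flagged and not eligible:
            fp += 1  # flagged but not eligible
        elif not flagged and eligible:
            fn += 1  # missed eligible patient
        else:
            tn += 1  # correctly excluded

    # Guard against empty classes; NaN signals "undefined".
    sensitivity = tp / (tp + fn) if (tp + fn) else math.nan
    specificity = tn / (tn + fp) if (tn + fp) else math.nan
    false_positive_rate = 1.0 - specificity
    return sensitivity, specificity, false_positive_rate


# Hypothetical labels for six patients (illustration only, not study data).
predicted = [True, True, False, False, True, False]
actual = [True, False, False, False, True, True]
sens, spec, fpr = screening_metrics(predicted, actual)
print(f"sensitivity={sens:.2f} specificity={spec:.2f} FPR={fpr:.2f}")
```

Running the same function once on decisions made from a traditional EHR and once on decisions made from the fused LHR would give the paired figures that hypothesis 4 compares.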

Public Health Relevance

This project is focused on generating a longitudinal health record for accelerating the pace at which patients can be recruited into clinical trials. Accelerating the pace at which patients are recruited into clinical trials has the potential for improving the speed at which new treatments are made available to the public.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM011116-03
Application #: 8722624
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan
Project Start: 2012-09-01
Project End: 2016-08-31
Budget Start: 2014-09-01
Budget End: 2015-08-31
Support Year: 3
Fiscal Year: 2014
Total Cost: not listed
Indirect Cost: not listed

Name: Ohio State University
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: not listed
City: Columbus
State: OH
Country: United States
Zip Code: 43210

Shivade, Chaitanya; Hebert, Courtney; Regan, Kelly et al. (2016) Automatic data source identification for clinical trial eligibility criteria resolution. AMIA Annu Symp Proc 2016:1149-1158
Griffis, Denis; Shivade, Chaitanya; Fosler-Lussier, Eric et al. (2016) A Quantitative and Qualitative Evaluation of Sentence Boundary Detection for the Clinical Domain. AMIA Jt Summits Transl Sci Proc 2016:88-97
Shivade, Chaitanya; Hebert, Courtney; Lopetegui, Marcelo et al. (2015) Textual inference for eligibility criteria resolution in clinical trials. J Biomed Inform 58 Suppl:S211-8
Shivade, Chaitanya; Malewadkar, Pranav; Fosler-Lussier, Eric et al. (2015) Comparison of UMLS terminologies to identify risk of heart disease using clinical notes. J Biomed Inform 58 Suppl:S103-10
Shivade, Chaitanya; Raghavan, Preethi; Fosler-Lussier, Eric et al. (2014) A review of approaches to identifying patient phenotype cohorts using electronic health records. J Am Med Inform Assoc 21:221-30
Raghavan, Preethi; Chen, James L; Fosler-Lussier, Eric et al. (2014) How essential are unstructured clinical narratives and information fusion to clinical trial recruitment? AMIA Jt Summits Transl Sci Proc 2014:218-23
Ren, Kaiyu; Lai, Albert M; Mukhopadhyay, Aveek et al. (2014) Effectively processing medical term queries on the UMLS Metathesaurus by layered dynamic programming. BMC Med Genomics 7 Suppl 1:S11
Raghavan, Preethi; Fosler-Lussier, Eric; Lai, Albert M (2012) Inter-annotator reliability of medical events, coreferences and temporal relations in clinical narratives by annotators with varying levels of clinical expertise. AMIA Annu Symp Proc 2012:1366-74