Semi-Automating Data Extraction for Systematic Reviews

Wallace, Byron; Marshall, Iain

Abstract

Semi-Automating Data Extraction for Systematic Reviews (Renewal). Evidence-based Medicine (EBM) aims to inform patient care using all available evidence. Realizing this aim in practice would require access to concise, comprehensive, and up-to-date structured summaries of the evidence relevant to a particular clinical question. Systematic reviews of biomedical literature aim to provide such summaries, and are a critical component of the EBM arsenal and modern medicine more generally. However, such reviews are extremely laborious to conduct. Furthermore, owing to the rapid expansion of the biomedical literature base, they tend to go out of date quickly as new evidence emerges. These factors hinder the practice of evidence-based care. In this renewal proposal, we seek to continue our ground-breaking efforts on developing, evaluating, and deploying novel machine learning (ML) and natural language processing (NLP) methods to automate or semi-automate the evidence synthesis process. This will extend our innovative and successful efforts developing RobotReviewer and related technologies under the current grant. Concretely, for this renewal we propose to move from extraction of clinically salient data elements from individual trials to synthesis of these elements across trials.
Our first aim is to extend our ML and NLP models to produce (as one deliverable) a publicly available, continuously and automatically updated semi-structured evidence database, comprising extracted data for all evidence, both published and unpublished. Unpublished trials will be identified via trial registries. Taking this up-to-date evidence repository as a starting point, we then propose cutting-edge ML and NLP models that will automatically generate first drafts of evidence syntheses. More specifically, we propose novel neural cross-document summarization models that will capitalize on the semi-structured information automatically extracted by our existing models, in addition to article texts. These models will be deployed in a new version of RobotReviewer, called RobotReviewerLive, intended to be a prototype for "living" systematic reviews. To rigorously evaluate the practical utility of the proposed methodological innovations, we will pilot their use to support real, ongoing, exemplar living reviews.

Public Health Relevance

Semi-Automating Data Extraction for Systematic Reviews (Renewal). Narrative: We propose novel machine learning and natural language processing methods that will aid biomedical literature summarization and synthesis, and thereby support the conduct of evidence-based medicine (EBM). The proposed models and technologies will motivate core methodological innovations and support real-time, up-to-date, semi-automated biomedical evidence syntheses ("systematic reviews"). Such approaches are necessary if we are to have any hope of practicing evidence-based care in our era of information overload.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 2R01LM012086-05
Application #: 9818711
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan

Project Start: 2015-09-20
Project End: 2023-06-30
Budget Start: 2019-09-01
Budget End: 2020-06-30
Support Year: 5
Fiscal Year: 2019

Institution

Name: Northeastern University
Type: Schools of Arts and Sciences
DUNS #: 001423631

City: Boston
State: MA
Country: United States
Zip Code: 02115

Related projects


NIH 2020 R01 LM	Semi-Automating Data Extraction for Systematic Reviews Wallace, Byron Casey; Marshall, Iain / Northeastern University
NIH 2019 R01 LM	Semi-Automating Data Extraction for Systematic Reviews Wallace, Byron Casey; Marshall, Iain / Northeastern University
NIH 2018 R01 LM	Semi-Automating Data Extraction for Systematic Reviews Bias, Randolph; Marshall, Iain; Trikalinos, Thomas; Wallace, Byron Casey / Northeastern University
NIH 2017 R01 LM	Semi-Automating Data Extraction for Systematic Reviews Bias, Randolph; Marshall, Iain; Trikalinos, Thomas; Wallace, Byron Casey / Northeastern University
NIH 2016 R01 LM	Semi-Automating Data Extraction for Systematic Reviews Bias, Randolph; Marshall, Iain; Trikalinos, Thomas; Wallace, Byron Casey / Northeastern University
NIH 2015 R01 LM	Semi-Automating Data Extraction for Systematic Reviews Bias, Randolph; Marshall, Iain; Trikalinos, Thomas; Wallace, Byron Casey / University of Texas Austin

Publications

Marshall, Iain J; Noel-Storr, Anna; Kuiper, Joël et al. (2018) Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide. Res Synth Methods 9:602-614

Marshall, Iain J; Kuiper, Joël; Banner, Edward et al. (2017) Automating Biomedical Evidence Synthesis: RobotReviewer. Proc Conf Assoc Comput Linguist Meet 2017:7-12

Singh, Gaurav; Marshall, Iain J; Thomas, James et al. (2017) A Neural Candidate-Selector Architecture for Automatic Structured Clinical Text Annotation. Proc ACM Int Conf Inf Knowl Manag 2017:1519-1528

Wallace, Byron C; Kuiper, Joël; Sharma, Aakash et al. (2016) Extracting PICO Sentences from Clinical Trial Reports using Supervised Distant Supervision. J Mach Learn Res 17:

Yu, Zhiguo; Bernstam, Elmer; Cohen, Trevor et al. (2016) Improving the utility of MeSH® terms using the TopicalMeSH representation. J Biomed Inform 61:77-86

Zhang, Ye; Marshall, Iain; Wallace, Byron C (2016) Rationale-Augmented Convolutional Neural Networks for Text Classification. Proc Conf Empir Methods Nat Lang Process 2016:795-804
