Evidence-based medicine (EBM) promises to transform the way that physicians treat their patients, resulting in better quality and more consistent care informed directly by the totality of relevant evidence. However, clinicians do not have the time to keep up to date with the vast medical literature. Systematic reviews, which provide rigorous, comprehensive and transparent assessments of the evidence pertaining to specific clinical questions, promise to mitigate this problem by concisely summarizing all pertinent evidence. But producing such reviews has become increasingly burdensome (and hence expensive), due in part to the exponential expansion of the biomedical literature base, hampering our ability to provide evidence-based care.

If we are to scale EBM to meet the demands imposed by the rapidly growing volume of published evidence, then we must modernize EBM tools and methods. More specifically, if we are to continue generating up-to-date evidence syntheses, then we must optimize the systematic review process. Toward this end, we propose developing new methods that combine crowdsourcing and machine learning to facilitate efficient annotation of the full texts of articles describing clinical trials. These annotations will comprise mark-up of sections of text that discuss clinically relevant fields of importance in EBM, such as patient characteristics, interventions studied, and potential sources of bias. Such annotations would make literature search and data extraction much easier for systematic reviewers, thus reducing their workload and freeing more time for them to conduct thoughtful evidence synthesis.

This will be the first in-depth exploration of crowdsourcing for EBM. We will collect annotations from workers with varying levels of expertise and cost, ranging from medical students to workers recruited via Amazon Mechanical Turk. We will develop and evaluate novel methods of aggregating annotations from such heterogeneous sources. We will then use the acquired manual annotations to train machine learning models that automate this mark-up process. Models capable of automatically identifying clinically salient text snippets in full-text articles describing clinical trials would be broadly useful for biomedical literature retrieval tasks and would have impact beyond our immediate application of EBM.
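The proposal leaves its aggregation models unspecified (developing them is part of the work), so as a point of reference the sketch below shows only the simplest baseline for combining token-level span annotations from several workers: a per-position majority vote in Python. All identifiers here (aggregate_majority, the worker ids, the P/I/O/N tag set) are hypothetical illustrations, not artifacts of the project.

    # A minimal sketch of token-level aggregation of crowd annotations.
    # This is a hypothetical baseline, not the proposal's actual method.
    from collections import Counter
    from typing import Dict, List

    # Each worker tags every token of an article snippet with a PICO-style
    # label, e.g. "P" (population), "I" (intervention), "O" (outcome),
    # or "N" (none of these).
    WorkerLabels = Dict[str, List[str]]  # worker id -> one tag per token

    def aggregate_majority(annotations: WorkerLabels) -> List[str]:
        """Collapse several workers' tag sequences into one sequence by
        taking the most frequent tag at each token position."""
        sequences = list(annotations.values())
        assert sequences and all(len(s) == len(sequences[0]) for s in sequences)
        aggregated = []
        for position_tags in zip(*sequences):
            tag, _count = Counter(position_tags).most_common(1)[0]
            aggregated.append(tag)
        return aggregated

    if __name__ == "__main__":
        # Three workers annotate the same six-token snippet; they disagree
        # on the boundaries of the population and intervention spans.
        crowd = {
            "turker_1":    ["N", "P", "P", "N", "I", "I"],
            "turker_2":    ["N", "P", "P", "P", "I", "I"],
            "med_student": ["N", "P", "P", "N", "I", "N"],
        }
        print(aggregate_majority(crowd))  # ['N', 'P', 'P', 'N', 'I', 'I']

The novel aggregation methods proposed above would go beyond such a vote, for instance by modeling the reliability of individual workers across whole label sequences (cf. Nguyen et al. 2017 in the publication list below).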

Public Health Relevance

We propose to develop crowdsourcing and machine learning methods to annotate clinically important sentences in full-text articles describing clinical trials. Ultimately, we aim to automate such annotation, thereby enabling more efficient practice of evidence-based medicine (EBM).

Agency
National Institutes of Health (NIH)
Institute
National Cancer Institute (NCI)
Type
Exploratory/Developmental Cooperative Agreement Phase I (UH2)
Project #
1UH2CA203711-01
Application #
9272982
Study Section
Special Emphasis Panel (ZRG1-BST-U (50)R)
Program Officer
Miller, David J
Project Start
Project End
Budget Start
2016-08-01
Budget End
2017-07-31
Support Year
1
Fiscal Year
2016
Total Cost
$240,650
Indirect Cost
$44,812
Name
Northeastern University
Department
Type
DUNS #
001423631
City
Boston
State
MA
Country
United States
Zip Code
02115
Nye, Benjamin; Li, Junyi Jessy; Patel, Roma et al. (2018) A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature. Proc Conf Assoc Comput Linguist Meet 2018:197-207
Patel, Roma; Yang, Yinfei; Marshall, Iain et al. (2018) Syntactic Patterns Improve Information Extraction for Medical Search. Proc Conf 2018:371-377
Marshall, Iain J; Noel-Storr, Anna; Kuiper, Joël et al. (2018) Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide. Res Synth Methods 9:602-614
Nguyen, An T; Wallace, Byron C; Li, Junyi Jessy et al. (2017) Aggregating and Predicting Sequence Labels from Crowd Annotations. Proc Conf Assoc Comput Linguist Meet 2017:299-309
Mortensen, Michael L; Adam, Gaelen P; Trikalinos, Thomas A et al. (2017) An exploration of crowdsourcing citation screening for systematic reviews. Res Synth Methods 8:366-386
Wiener, Martin; Sommer, Friedrich T; Ives, Zachary G et al. (2016) Enabling an Open Data Ecosystem for the Neurosciences. Neuron 92:617-621