Evidence-based medicine (EBM) promises to transform the way physicians treat their patients, resulting in higher-quality, more consistent care informed directly by the totality of relevant evidence. However, clinicians do not have the time to keep up to date with the vast medical literature. Systematic reviews, which provide rigorous, comprehensive, and transparent assessments of the evidence pertaining to specific clinical questions, promise to mitigate this problem by concisely summarizing all pertinent evidence. But producing such reviews has become increasingly burdensome (and hence expensive), due in part to the exponential expansion of the biomedical literature, hampering our ability to provide evidence-based care.

If we are to scale EBM to meet the demands imposed by the rapidly growing volume of published evidence, then we must modernize EBM tools and methods. More specifically, if we are to continue generating up-to-date evidence syntheses, then we must optimize the systematic review process. Toward this end, we propose developing new methods that combine crowdsourcing and machine learning to facilitate efficient annotation of the full texts of articles describing clinical trials. These annotations will comprise mark-up of sections of text that discuss fields of clinical importance in EBM, such as patient characteristics, the interventions studied, and potential sources of bias. Such annotations would make literature search and data extraction much easier for systematic reviewers, reducing their workload and freeing more time for thoughtful evidence synthesis.

This will be the first in-depth exploration of crowdsourcing for EBM. We will collect annotations from workers with varying levels of expertise and cost, ranging from medical students to workers recruited via Amazon Mechanical Turk. We will develop and evaluate novel methods for aggregating annotations from such heterogeneous sources, and we will use the acquired manual annotations to train machine learning models that automate this mark-up process. Models capable of automatically identifying clinically salient text snippets in full-text articles describing clinical trials would be broadly useful for biomedical literature retrieval tasks and would have impact beyond our immediate application to EBM.
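
To make the annotation-aggregation step concrete, the sketch below illustrates one very simple baseline for combining span annotations from multiple workers: token-level majority voting. It is purely illustrative; the label names, data layout, and function are hypothetical, and the aggregation methods we propose to develop and evaluate go beyond this baseline.

    # Hypothetical illustration only: a token-level majority-vote baseline for
    # aggregating crowd annotations of clinically salient spans (e.g., patient
    # populations or interventions). Label names and data layout are invented
    # for this sketch, not the project's actual annotation schema.
    from collections import Counter

    def majority_vote(worker_label_seqs, default_label="O"):
        """Combine aligned per-token label sequences from several workers."""
        consensus = []
        for token_votes in zip(*worker_label_seqs):
            counts = Counter(token_votes)
            label, n = counts.most_common(1)[0]
            # Break ties conservatively in favor of the non-annotated default label.
            if label != default_label and n == counts[default_label]:
                label = default_label
            consensus.append(label)
        return consensus

    # Three (hypothetical) workers label the same five tokens:
    workers = [
        ["O", "POPULATION", "POPULATION", "O", "INTERVENTION"],
        ["O", "POPULATION", "O",          "O", "INTERVENTION"],
        ["O", "O",          "POPULATION", "O", "INTERVENTION"],
    ]
    print(majority_vote(workers))
    # -> ['O', 'POPULATION', 'POPULATION', 'O', 'INTERVENTION']

A consensus sequence such as this could then serve as (noisy) training data for the machine learning models described above.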

Public Health Relevance

We propose to develop crowdsourcing and machine learning methods to annotate clinically important sentences in full-text articles describing clinical trials. Ultimately, we aim to automate such annotation, thereby enabling more efficient practice of evidence-based medicine (EBM).

Agency: National Institutes of Health (NIH)
Institute: National Cancer Institute (NCI)
Type: Exploratory/Developmental Cooperative Agreement Phase I (UH2)
Project #: 5UH2CA203711-02
Application #: 9275458
Study Section: Special Emphasis Panel (ZRG1)
Program Officer: Miller, David J
Project Start: 2016-08-01
Project End: 2018-07-31
Budget Start: 2017-08-01
Budget End: 2018-07-31
Support Year: 2
Fiscal Year: 2017
Total Cost:
Indirect Cost:
Name: Northeastern University
Department:
Type: Schools of Arts and Sciences
DUNS #: 001423631
City: Boston
State: MA
Country: United States
Zip Code: 02115
Nye, Benjamin; Jessy Li, Junyi; Patel, Roma et al. (2018) A Corpus with Multi-Level Annotations of Patients, Interventions and Outcomes to Support Language Processing for Medical Literature. Proc Conf Assoc Comput Linguist Meet 2018:197-207
Patel, Roma; Yang, Yinfei; Marshall, Iain et al. (2018) Syntactic Patterns Improve Information Extraction for Medical Search. Proc Conf 2018:371-377
Marshall, Iain J; Noel-Storr, Anna; Kuiper, Joël et al. (2018) Machine learning for identifying Randomized Controlled Trials: An evaluation and practitioner's guide. Res Synth Methods 9:602-614
Nguyen, An T; Wallace, Byron C; Li, Junyi Jessy et al. (2017) Aggregating and Predicting Sequence Labels from Crowd Annotations. Proc Conf Assoc Comput Linguist Meet 2017:299-309
Mortensen, Michael L; Adam, Gaelen P; Trikalinos, Thomas A et al. (2017) An exploration of crowdsourcing citation screening for systematic reviews. Res Synth Methods 8:366-386
Wiener, Martin; Sommer, Friedrich T; Ives, Zachary G et al. (2016) Enabling an Open Data Ecosystem for the Neurosciences. Neuron 92:617-621