Screening Nonrandomized Studies for Inclusion in Systematic Reviews of Evidence

Translation of biomedical research into practice depends in part on the production of quality systematic reviews that synthesize available evidence. Unfortunately, about 20% of reviews are never completed. Of those that reach fruition, the average time to completion may be 2.4 years, with a reported maximum of 9 years. A major bottleneck occurs when teammates screen studies. In the first step, they independently identify provisionally eligible studies by reading the same set of perhaps thousands of titles and abstracts. To date, researchers have used supervised machine learning (ML) methods in an attempt to automate identification of eligible randomized controlled trials (RCTs). However, finding nonrandomized (NR) studies for inclusion in systematic reviews has yet to be addressed. This is an important problem because RCTs may be unlikely or even unethical for some research questions.

Hypotheses. It is broadly hypothesized that (a) methods based on natural language processing and ML can be used to automatically identify topically relevant studies with a mix of NR designs eligible for inclusion in systematic reviews; and (b) machine performance can consistently reach current human standards with respect to identifying eligible studies.

Aims. This research has three aims:
(1) Compare the language that biomedical researchers use to describe their NR study designs with existing relevant vocabularies. Develop complementary terminologies for overlooked NR study designs to improve coverage of important vocabularies. Develop and validate a standalone terminology to support librarians who add free-text terms to expert searches.
(2) Develop and compare procedures based on natural language processing and supervised ML methods to identify provisionally eligible NR studies that are topically relevant from a set of citations, including titles, abstracts, and metadata. Use terms for NR study designs to improve classification.
(3) Generalize procedures developed under Aims 1 and 2 to select topically relevant studies with a mix of designs for provisional inclusion in several types of systematic reviews. Use contextual information in segments of full texts tagged for location to enrich feature vectors.

Methods. Reference standards will be built from studies in published Cochrane reviews. Features will be extracted from citations and regions of full texts. Additionally, feature vectors will be enriched with terms for designs that researchers use in combination with terms extracted from major vocabularies. Model performance will be compared with respect to several measures, including mean recall and precision, for 10-fold cross-validations and validations on held-out test sets.

Significance. The proposed research is significant because it will help support translation of biomedical research to improve human health. Moreover, developing procedures to identify NR studies is essential for the expeditious translation of a very large body of research.
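
The evaluation described under Methods (mean recall and precision over 10-fold cross-validation) can be illustrated with a minimal sketch. It is not the project's implementation: the term list `DESIGN_TERMS`, the toy `fit` and `predict` functions, and the contiguous fold layout are all assumptions made for illustration, and a real study would substitute a supervised ML classifier and richer feature vectors.

```python
from statistics import mean

# Hypothetical list of NR design terms; the project's terminologies are not given here.
DESIGN_TERMS = ("cohort", "case-control", "cross-sectional")


def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels; 0.0 when undefined."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) for k contiguous folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_items) if i < start or i >= stop]
        yield train, test
        start = stop


def fit(texts, labels):
    """Toy learner: keep design terms seen more often in eligible citations."""
    kept = set()
    for term in DESIGN_TERMS:
        pos = sum(1 for t, y in zip(texts, labels) if y and term in t.lower())
        neg = sum(1 for t, y in zip(texts, labels) if not y and term in t.lower())
        if pos > neg:
            kept.add(term)
    return kept


def predict(model, text):
    """Flag a citation as provisionally eligible if it contains a kept term."""
    return any(term in text.lower() for term in model)


def cross_validate(texts, labels, k=10):
    """Return (mean precision, mean recall) across k folds."""
    precisions, recalls = [], []
    for train, test in k_fold_splits(len(texts), k):
        model = fit([texts[i] for i in train], [labels[i] for i in train])
        preds = [predict(model, texts[i]) for i in test]
        p, r = precision_recall([labels[i] for i in test], preds)
        precisions.append(p)
        recalls.append(r)
    return mean(precisions), mean(recalls)


if __name__ == "__main__":
    # Made-up citations: 1 = provisionally eligible NR study, 0 = not eligible.
    texts = ["A cohort study of X", "A randomized trial of Y"] * 10
    labels = [1, 0] * 10
    print(cross_validate(texts, labels, k=10))
```

In screening, recall is usually the measure to watch most closely, since an eligible study that is missed at this stage is unlikely to be recovered later; precision reflects how much reading the team is spared.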

Public Health Relevance

Translation of biomedical research helps to improve public health by delivering the best available evidence to clinicians. This process depends in part on the production of systematic reviews of research. Computerized procedures will be developed to reduce the labor associated with screening nonrandomized studies for inclusion in reviews.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Transition Award (R00)
Project #: 4R00LM010943-02
Application #: 8471822
Study Section: Special Emphasis Panel (NSS)
Program Officer: Sim, Hua-Chuan
Project Start: 2012-07-01
Project End: 2015-06-30
Budget Start: 2012-07-01
Budget End: 2013-06-30
Support Year: 2
Fiscal Year: 2012
Total Cost: $224,100
Indirect Cost: $75,720

Name: University of Pittsburgh
Department: Miscellaneous
Type: Schools of Medicine
DUNS #: 004514360
City: Pittsburgh
State: PA
Country: United States
Zip Code: 15213

Frazier, John J; Stein, Corey D; Tseytlin, Eugene et al. (2015) Building a gold standard to construct search filters: a case study with biomarkers for oral cancer. J Med Libr Assoc 103:22-30
Bekhuis, Tanja; Tseytlin, Eugene; Mitchell, Kevin J (2015) A Prototype for a Hybrid System to Support Systematic Review Teams: A Case Study of Organ Transplantation. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2015:940-947
Bekhuis, Tanja; Tseytlin, Eugene; Mitchell, Kevin J et al. (2014) Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence. PLoS One 9:e86277
Song, Mei; O'Donnell, Jean A; Bekhuis, Tanja et al. (2013) Are dentists interested in the oral-systemic disease connection? A qualitative study of an online community of 450 practitioners. BMC Oral Health 13:65
Bekhuis, Tanja; Demner-Fushman, Dina; Crowley, Rebecca S (2013) Comparative effectiveness research designs: an analysis of terms and coverage in Medical Subject Headings (MeSH) and Emtree. J Med Libr Assoc 101:92-100
Bekhuis, Tanja; Demner-Fushman, Dina (2012) Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. Artif Intell Med 55:197-207