Screening Nonrandomized Studies for Inclusion in Systematic Reviews of Evidence

Translation of biomedical research into practice depends in part on the production of quality systematic reviews that synthesize available evidence. Unfortunately, about 20% of reviews are never completed. Of those that reach fruition, the average time to completion may be 2.4 years, with a reported maximum of 9 years. A major bottleneck occurs when teammates screen studies. In the first step, they independently identify provisionally eligible studies by reading the same set of perhaps thousands of titles and abstracts. To date, researchers have used supervised machine learning (ML) methods in an attempt to automate identification of eligible randomized controlled trials (RCTs). However, finding nonrandomized (NR) studies for inclusion in systematic reviews has yet to be addressed. This is an important problem because RCTs may be unlikely or even unethical for some research questions.

Hypotheses. It is broadly hypothesized that (a) methods based on natural language processing and ML can be used to automatically identify topically relevant studies with a mix of NR designs eligible for inclusion in systematic reviews; and (b) machine performance can consistently reach current human standards with respect to identifying eligible studies.
Aims. This research has three aims:
(1) Compare the language that biomedical researchers use to describe their NR study designs with existing relevant vocabularies. Develop complementary terminologies for overlooked NR study designs to improve coverage of important vocabularies. Develop and validate a standalone terminology to support librarians who add free-text terms to expert searches.
(2) Develop and compare procedures based on natural language processing and supervised ML methods to identify provisionally eligible NR studies that are topically relevant from a set of citations, including titles, abstracts, and metadata. Use terms for NR study designs to improve classification.
(3) Generalize procedures developed under Aims 1 and 2 to select topically relevant studies with a mix of designs for provisional inclusion in several types of systematic reviews. Use contextual information in segments of full texts tagged for location to enrich feature vectors.

Methods. Reference standards will be built from studies in published Cochrane reviews. Features will be extracted from citations and regions of full texts. Additionally, feature vectors will be enriched with terms for designs that researchers use in combination with terms extracted from major vocabularies. Model performance will be compared with respect to several measures, including mean recall and precision, for 10-fold cross-validations and validations on held-out test sets.

Significance. The proposed research is significant because it will help support translation of biomedical research to improve human health. Moreover, developing procedures to identify NR studies is essential for the expeditious translation of a very large body of research.
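
The evaluation named under Methods (mean recall and precision over 10-fold cross-validation) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the project's actual procedure: the function names are hypothetical, the folds are contiguous and unshuffled, and a deliberately simple term-based learner stands in for the supervised ML classifiers the abstract refers to. The learner keeps tokens that occur in eligible training citations but never in ineligible ones, and flags a citation as provisionally eligible if it contains any such token.

```python
from statistics import mean


def tokenize(text):
    """Lowercase, split on whitespace, and strip simple punctuation."""
    tokens = {tok.strip(".,;:()").lower() for tok in text.split()}
    tokens.discard("")
    return tokens


def k_fold_indices(n_items, k):
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_items) if i < start or i >= stop]
        yield train, test
        start = stop


def train_term_model(texts, labels):
    """Toy learner (assumption): tokens seen only in eligible citations."""
    eligible, ineligible = set(), set()
    for text, label in zip(texts, labels):
        (eligible if label else ineligible).update(tokenize(text))
    return eligible - ineligible


def predict(model_terms, text):
    """Flag a citation as eligible if it shares any token with the model."""
    return bool(model_terms & tokenize(text))


def precision_recall(y_true, y_pred):
    """Return (precision, recall); 0.0 when a denominator is zero."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def cross_validate(texts, labels, k=10):
    """Return (mean precision, mean recall) across k folds."""
    precisions, recalls = [], []
    for train, test in k_fold_indices(len(texts), k):
        model = train_term_model(
            [texts[i] for i in train], [labels[i] for i in train]
        )
        preds = [predict(model, texts[i]) for i in test]
        p, r = precision_recall([labels[i] for i in test], preds)
        precisions.append(p)
        recalls.append(r)
    return mean(precisions), mean(recalls)
```

Calling `cross_validate(texts, labels, k=10)` on a list of citation strings and a parallel list of boolean eligibility labels returns the two fold-averaged scores. In a screening setting recall is usually the measure to watch, since a missed eligible study is costlier than an extra abstract to read; a real pipeline would also shuffle or stratify the folds and keep a separate held-out test set, as the Methods paragraph describes.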

Public Health Relevance

Translation of biomedical research helps to improve public health by delivering the best available evidence to clinicians. This process depends in part on the production of systematic reviews of research. Computerized procedures will be developed to reduce the labor associated with screening nonrandomized studies for inclusion in reviews.

National Institutes of Health (NIH)
National Library of Medicine (NLM)
Research Transition Award (R00)
Study Section: Special Emphasis Panel (NSS)
Program Officer: Sim, Hua-Chuan
University of Pittsburgh
Schools of Medicine
United States
Frazier, John J; Stein, Corey D; Tseytlin, Eugene et al. (2015) Building a gold standard to construct search filters: a case study with biomarkers for oral cancer. J Med Libr Assoc 103:22-30
Bekhuis, Tanja; Tseytlin, Eugene; Mitchell, Kevin J (2015) A Prototype for a Hybrid System to Support Systematic Review Teams: A Case Study of Organ Transplantation. Proceedings (IEEE Int Conf Bioinformatics Biomed) 2015:940-947
Bekhuis, Tanja; Tseytlin, Eugene; Mitchell, Kevin J et al. (2014) Feature engineering and a proposed decision-support system for systematic reviewers of medical evidence. PLoS One 9:e86277
Bekhuis, Tanja; Demner-Fushman, Dina; Crowley, Rebecca S (2013) Comparative effectiveness research designs: an analysis of terms and coverage in Medical Subject Headings (MeSH) and Emtree. J Med Libr Assoc 101:92-100
Song, Mei; O'Donnell, Jean A; Bekhuis, Tanja et al. (2013) Are dentists interested in the oral-systemic disease connection? A qualitative study of an online community of 450 practitioners. BMC Oral Health 13:65
Bekhuis, Tanja; Demner-Fushman, Dina (2012) Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. Artif Intell Med 55:197-207