Translation of biomedical research into practice depends in part on the production of quality systematic reviews that synthesize available evidence. Unfortunately, about 20% of reviews are never completed. Of those that reach fruition, the average time to completion may be 2.4 years, with a reported maximum of 9 years. A major bottleneck occurs when teammates screen studies. In the first step, they independently identify provisionally eligible studies by reading the same set of perhaps thousands of titles and abstracts. To date, researchers have used supervised machine learning (ML) methods in an attempt to automate identification of eligible randomized controlled trials (RCTs). However, finding nonrandomized (NR) studies for inclusion in systematic reviews has yet to be addressed. This is an important problem because RCTs may be unlikely or even unethical for some research questions. Hypotheses. It is broadly hypothesized that (a) methods based on natural language processing and ML can be used to automatically identify topically relevant studies with a mix of NR designs eligible for inclusion in systematic reviews;and (b) machine performance can consistently reach current human standards with respect to identifying eligible studies.
Aims. This research has three aims: (1) Compare the language that biomedical researchers use to describe their NR study designs with existing relevant vocabularies. Develop complementary terminologies for overlooked NR study designs to improve coverage of important vocabularies. Develop and validate a standalone terminology to support librarians who add free-text terms to expert searches. (2) Develop and compare procedures based on natural language processing and supervised ML methods to identify provisionally eligible NR studies that are topically relevant from a set of citations, including titles, abstracts, and metadata. Use terms for NR study designs to improve classification. (3) Generalize procedures developed under Aims 1 and 2 to select topically relevant studies with a mix of designs for provisional inclusion in several types of systematic reviews. Use contextual information in segments of full texts tagged for location to enrich feature vectors. Methods. Reference standards will be built from studies in published Cochrane reviews. Features will be extracted from citations and regions of full texts. Additionally, feature vectors will be enriched with terms for designs that researchers use in combination with terms extracted from major vocabularies. Model performance will be compared with respect to several measures, including mean recall and precision, for 10-fold cross-validations and validations on held-out test sets. Significance. The proposed research is significant because it will help support translation of biomedical research to improve human health. Moreover, developing procedures to identify NR studies is essential for the expeditious translation of a very large body of research.

Public Health Relevance

Translation of biomedical research helps to improve public health by delivering the best available evidence to clinicians. This process depends in part on the production of systematic reviews of research. Computerized procedures will be developed to reduce the labor associated with screening nonrandomized studies for inclusion in reviews.

National Institute of Health (NIH)
National Library of Medicine (NLM)
Career Transition Award (K99)
Project #
Application #
Study Section
Biomedical Library and Informatics Review Committee (BLR)
Program Officer
Sim, Hua-Chuan
Project Start
Project End
Budget Start
Budget End
Support Year
Fiscal Year
Total Cost
Indirect Cost
University of Pittsburgh
Schools of Medicine
United States
Zip Code
Bekhuis, Tanja; Demner-Fushman, Dina; Crowley, Rebecca S (2013) Comparative effectiveness research designs: an analysis of terms and coverage in Medical Subject Headings (MeSH) and Emtree. J Med Libr Assoc 101:92-100
Bekhuis, Tanja; Demner-Fushman, Dina (2012) Screening nonrandomized studies for medical systematic reviews: a comparative study of classifiers. Artif Intell Med 55:197-207
Bekhuis, Tanja; Kreinacke, Marcos; Spallek, Heiko et al. (2011) Using natural language processing to enable in-depth analysis of clinical messages posted to an Internet mailing list: a feasibility study. J Med Internet Res 13:e98