The work proposed in this new investigator-initiated project studies the hypothesis that machine learning-based text classification techniques can add significant efficiencies to the process of updating systematic reviews (SRs). Because new information constantly becomes available, medicine is constantly changing, and SRs must undergo periodic updates in order to correctly represent the best available medical knowledge at a given time.

To support studying this hypothesis, the work proposed here will undertake four specific aims:

1. Refinement and further development of text classification algorithms optimized for classifying literature for the update of systematic reviews in a variety of therapeutic domains. Several machine learning techniques and strategies will be compared, as well as various means of representing journal articles as feature vectors input to the process.
2. Identification and evaluation of systematic review expert preferences and trade-offs between high-recall and high-precision classification systems. There are several opportunities for including this technology in the process of creating SRs. Each of these applications has its own precision and recall trade-off thresholds, which will be studied based on the benefit to systematic reviews.
3. Prospective evaluation of text classification algorithms. We will verify that our approach performs as expected on future data.
4. Development of comprehensive gold-standard test and training sets to motivate and evaluate the proposed and future work in this area.

The long-term relevance of this research to public health is that automated document classification will enable more efficient use of expert resources to create systematic reviews. This will increase both the number and quality of reviews for a given level of public support. Since up-to-date systematic reviews are essential for establishing widespread high-quality practice standards and guidelines, overall public health will benefit from this work.
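The precision and recall trade-off named in the second aim can be illustrated with a minimal sketch. It is not the project's method: the function name `precision_recall`, the classifier scores, and the relevance labels below are all hypothetical, chosen only to show how moving a decision threshold exchanges recall for precision.

```python
def precision_recall(scores, labels, threshold):
    """Return (precision, recall) when articles scoring >= threshold are flagged for review."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)  # relevant and flagged
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)  # irrelevant but flagged
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)   # relevant but missed
    precision = tp / (tp + fp) if (tp + fp) else 1.0
    recall = tp / (tp + fn) if (tp + fn) else 1.0
    return precision, recall


# Hypothetical classifier scores for six candidate articles (assumption, not project data).
scores = [0.95, 0.80, 0.60, 0.40, 0.30, 0.10]
# Hypothetical reviewer judgments: 1 = belongs in the updated review, 0 = does not.
labels = [1, 1, 0, 1, 0, 0]

for threshold in (0.2, 0.5, 0.9):
    p, r = precision_recall(scores, labels, threshold)
    print(f"threshold={threshold:.1f}  precision={p:.2f}  recall={r:.2f}")
```

In this toy data, a low threshold flags every relevant article at the cost of extra reading, while a high threshold flags only confident hits and misses relevant ones. Which end of that range is acceptable would depend on where in the SR process the classifier is used.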

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Research Project (R01)
Project #: 5R01LM009501-02
Application #: 7468470
Study Section: Biomedical Library and Informatics Review Committee (BLR)
Program Officer: Sim, Hua-Chuan
Project Start: 2007-07-15
Project End: 2010-07-14
Budget Start: 2008-07-15
Budget End: 2009-07-14
Support Year: 2
Fiscal Year: 2008
Total Cost: $286,582
Indirect Cost:

Name: Oregon Health and Science University
Department: Biostatistics & Other Math Sci
Type: Schools of Medicine
DUNS #: 096997515
City: Portland
State: OR
Country: United States
Zip Code: 97239

Cohen, Aaron M; Ambert, Kyle; McDonagh, Marian (2012) Studying the potential impact of automated document classification on scheduling a systematic review update. BMC Med Inform Decis Mak 12:33
Cohen, Aaron M; Ambert, Kyle; McDonagh, Marian (2009) Cross-topic learning for work prioritization in systematic review creation and update. J Am Med Inform Assoc 16:690-704
Yang, Jianji J; Cohen, Aaron M et al. (2008) SYRIAC: The systematic review information automated collection system: a data warehouse for facilitating automated biomedical text classification. AMIA Annu Symp Proc :825-9
Cohen, Aaron M (2008) Optimizing feature representation for automated systematic review work prioritization. AMIA Annu Symp Proc :121-5