1) Biomedical literature search is the main entry point to an ever-increasing range of information, and PubMed/MEDLINE is the most widely used service for this purpose. However, finding citations relevant to a user's information need is not always easy in PubMed. A better understanding of the growing population of PubMed users, their information needs, and the ways in which they meet those needs opens opportunities to improve the information services and information access that PubMed provides. One resource for understanding and characterizing the patrons of a search engine is its transaction logs. Our previous investigation of user query logs led us to develop and deploy Related Queries (RQ), an application that assists users with query formulation in PubMed. Inspired by its success, we have continued to use log analysis to identify research problems closely related to PubMed operations.
2) We have studied the logs to see how the history of access to a given document can be used to improve the estimate of its likelihood of being accessed in the future. We find that up to 300 days of access history carries useful information about a document's future access, but the value of that information decreases with age until it is near zero at 300 days. Estimates of the likelihood of click-through based on this log analysis are currently being tested in the Solr search engine environment under investigation by IEB. We are also collecting data on click-through events to full text as a more robust way of estimating the value of documents to users, and we plan to pursue the same strategy of gathering data from the user logs to predict future click-through events to full text.
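As a rough illustration of the access-history idea in 2), the following minimal sketch scores a document by summing age-discounted past accesses. Only the 300-day horizon comes from the text above. The linear decay shape and the names `decay_weight` and `access_score` are assumptions made for illustration, not the estimator actually used in the log analysis.

```python
from datetime import date

# From the text: the value of access history is near zero at 300 days.
HISTORY_WINDOW_DAYS = 300


def decay_weight(age_days: int, window: int = HISTORY_WINDOW_DAYS) -> float:
    """Weight of one past access. Linear decay is an assumed shape:
    1.0 for an access today, falling to 0.0 at `window` days."""
    if age_days < 0 or age_days >= window:
        return 0.0
    return 1.0 - age_days / window


def access_score(access_dates: list[date], today: date) -> float:
    """Hypothetical score for one document: sum of decayed weights
    over the dates on which it was accessed."""
    return sum(decay_weight((today - d).days) for d in access_dates)


if __name__ == "__main__":
    today = date(2011, 1, 1)
    accesses = [date(2010, 12, 31), date(2010, 8, 4), date(2010, 1, 1)]
    print(access_score(accesses, today))
```

A score of this kind could serve as one input to a ranking function, with recent accesses counting more than old ones and anything older than the window ignored. The real decay curve would need to be fitted to the logs.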
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A et al. (2011) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 39:D38-51
Mork, James G; Bodenreider, Olivier; Demner-Fushman, Dina et al. (2010) Extracting Rx information from clinical narrative. J Am Med Inform Assoc 17:536-9
Sayers, Eric W; Barrett, Tanya; Benson, Dennis A et al. (2010) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res 38:D5-16
Lu, Zhiyong; Kim, Won; Wilbur, W John (2009) Evaluating relevance ranking strategies for MEDLINE retrieval. J Am Med Inform Assoc 16:32-6
Islamaj Dogan, Rezarta; Murray, G Craig; Neveol, Aurelie et al. (2009) Understanding PubMed user search behavior through log analysis. Database (Oxford) 2009:bap018
Lu, Zhiyong; Wilbur, W John (2009) Improving accuracy for identifying related PubMed queries by an integrated approach. J Biomed Inform 42:831-8
Lu, Zhiyong; Wilbur, W John; McEntyre, Johanna R et al. (2009) Finding query suggestions for PubMed. AMIA Annu Symp Proc 2009:396-400