Biomedical literature search is the main entry point for an ever-increasing range of information. PubMed/MEDLINE is the most widely used service for this purpose. During the past few years, there have been efforts to incorporate new features into PubMed to facilitate the user search experience (e.g. Related Articles). Although those features have proved to be highly successful, the ways that searchers express their information needs have changed very little. Typically, users type a few words into the search box, and PubMed returns a list of results. When the search results fail to satisfy their interests, they try again and again with little support from the search engine.
We aim to improve the interaction between searchers and the PubMed search engine. Analysis of PubMed query logs supports this objective by giving us a better understanding of users' information needs and search strategies. Specifically, we are developing two new interactive search assistants: Related Queries (RQ) and PubMed Ads. RQ helps users refine their search by automatically suggesting alternative queries in response to user input. To this end, we have developed techniques to collect and aggregate raw PubMed logs and to identify related PubMed queries in different user sessions. Parts of our research on RQ have already been deployed in the live PubMed search engine. Preliminary experimental results show that users clicked on refined queries approximately 8% of the time when they were available, which we regard as promising user feedback.
Toward the development of PubMed Ads, a feature that will recommend a small number of articles that are highly relevant and important to the searchers, we have compared and evaluated several existing relevance ranking and query expansion algorithms, which are traditionally regarded as useful techniques for boosting retrieval effectiveness. However, in the context of PubMed search, none of the existing techniques by itself was able to find noteworthy articles for PubMed Ads. Current work is focused on finding additional evidence, such as an article's usage history, and on building an evidence-based model that integrates all aspects of an article in a comprehensive manner.
Log analysis can help not only search as a whole; it can also play an important role in developing tools that improve search on an individual basis. Through our analysis of PubMed query logs, we are able to better understand each individual user's information needs. As a result, we could build a unique profile for each PubMed user, which would allow us to provide personalized search recommendations in PubMed. For instance, if a user is identified as a breast cancer researcher, then any news on breast cancer research and treatment in MEDLINE could be suggested. Currently, we are developing computational techniques to characterize individual users based on both the queries a person entered and the documents s/he viewed.

Agency: National Institutes of Health (NIH)
Institute: National Library of Medicine (NLM)
Type: Intramural Research (Z01)
Project #: 1Z01LM200812-01
Application #: 7735088
Study Section:
Project Start:
Project End:
Budget Start:
Budget End:
Support Year: 1
Fiscal Year: 2008
Total Cost: $293,131
Indirect Cost:

Name: National Library of Medicine
Department:
Type:
DUNS #:
City:
State:
Country: United States
Zip Code:
Lu, Zhiyong; Kim, Won; Wilbur, W John (2009) Evaluating relevance ranking strategies for MEDLINE retrieval. J Am Med Inform Assoc 16:32-6
Yanikkerem, Emre; Ozdemir, Meral; Bingol, Hilal et al. (2009) Women's attitudes and expectations regarding gynaecological examination. Midwifery 25:500-8