Inferring searcher intent is a central problem in information retrieval and web search: for effective ranking and result presentation, the search engine must know what the user is looking for. Yet expressing a searcher's information need currently relies on entering the "right" search keywords, which can require multiple rounds of trial and error from the searcher. The goal of this project is to develop effective methods for a search engine to automatically infer searcher intent and information needs from the searcher's interactions and behavior data. Specifically, the project addresses two main challenges of search intent inference: developing accurate and robust models of searcher intent and behavior, and exploiting these models to infer search intent for each individual user. This project significantly advances previous efforts on implicit feedback and search modeling by considering a wide range of user interaction and contextual features, and by developing novel techniques for mining and exploiting these signals to improve web search and information access.

To develop robust search intent and behavior models, the project uses machine learning and data mining techniques to model the connection between the searcher's observable behavior (search actions and result page behavior) and the searcher's intent. The first stage of the project develops and evaluates these models in controlled lab environments by combining eye tracking and search interface instrumentation data. The second stage of the project empirically validates the intent inference models through a large-scale collection of search behavior data, using a variety of remote user studies with instrumented search interfaces. Finally, the project applies the resulting models and algorithms to improve performance on key information retrieval tasks, including result ranking, automatic query expansion, and search result presentation.

The techniques developed in this project are expected to make web search and information access more intuitive and effective for millions of users through collaboration with major search engine companies. Additional broader impacts will be achieved through domain-specific applications of the developed techniques, ranging from improved library search to web-based diagnostics of cognitive impairment. All aspects of the project will involve graduate and undergraduate students, and the resulting tools and datasets will be integrated into undergraduate course instruction and projects, thus broadening participation in computer science research. The resulting publications, software, and datasets will be made publicly available on the project website (http://ir.mathcs.emory.edu/intent/).

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1018321
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 2010-09-01
Budget End: 2014-08-31
Support Year:
Fiscal Year: 2010
Total Cost: $500,000
Indirect Cost:
Name: Emory University
Department:
Type:
DUNS #:
City: Atlanta
State: GA
Country: United States
Zip Code: 30322