Web search engines have become indispensable to people from all walks of life for locating information on a broad range of topics. Data from users' interactions with search engines, with web pages, and with each other provide a rich source of information for understanding and profiling users. This project develops a novel probabilistic framework and associated machine learning algorithms for modeling users and their behavior on the web, with the goal of improving the ranking of results returned in response to user queries.

This project addresses three closely related technical topics in behavioral and social data modeling, covering the underlying theory, algorithm development, and evaluation on real-world data: 1) developing rich models of user interactions with search engines, with information sources, and with each other that can handle multiple types of relationships among the entities; 2) exploiting the resulting models of user behavior to improve the ranking of web pages returned in response to a user query; and 3) modeling the temporal dynamics of user behavior to detect and respond to changes in the user's social environment or context-dependent changes in the user's behavior. The resulting algorithms will be evaluated using (i) real-world data available in the public domain, (ii) real-world data sets from industrial collaborators, and (iii) simulated data that preserve the relevant statistical properties of their real-world counterparts.

The project advances the state of the art in information retrieval. Collaborations with industry leaders in web search and e-commerce, such as Yahoo!, Microsoft, Tencent, and Alibaba, can potentially lead to significant impact on web search and ranking, recommender systems, and related areas. The project also offers enhanced research-based training opportunities for graduate and undergraduate students at Georgia Tech. Additional information about the project can be found at: www.cc.gatech.edu/~zha/socialL2R.html

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1116886
Program Officer: Sylvia Spengler
Project Start:
Project End:
Budget Start: 2011-09-01
Budget End: 2015-08-31
Support Year:
Fiscal Year: 2011
Total Cost: $508,025
Indirect Cost:
Name: Georgia Tech Research Corporation
Department:
Type:
DUNS #:
City: Atlanta
State: GA
Country: United States
Zip Code: 30332