Web search engines have become indispensable for people from all walks of life for locating information on a broad range of topics. Data from users' interactions with search engines, with web pages, and with each other provide a rich source of information for understanding and profiling users. This project develops a novel probabilistic framework and associated machine learning algorithms for modeling users and their behavior on the web, with the goal of improving the ranking of results returned in response to user queries.
This project addresses three closely related technical topics in behavioral and social data modeling, covering the underlying theory, algorithm development, and evaluation on real-world data: 1) developing rich models of user interactions with search engines, with information sources, and with each other that can handle multiple types of relationships among the entities; 2) exploiting the resulting models of user behavior to improve the ranking of web pages returned in response to a user query; and 3) modeling the temporal dynamics of user behavior to detect and respond to changes in the social environment of the user or context-dependent changes in the user's behavior. The resulting algorithms will be evaluated using (i) real-world data available in the public domain, (ii) real-world data sets from industrial collaborators, and (iii) simulated data that preserve the relevant statistical properties of their real-world counterparts.
The project advances the state of the art in information retrieval, with potentially large impact on web search and related applications. Collaborations with industry leaders in web search and e-commerce, such as Yahoo!, Microsoft, Tencent, and Alibaba, can potentially lead to significant impact on information retrieval, web search and ranking, recommender systems, and related areas. The project offers enhanced research-based training opportunities for graduate and undergraduate students at Georgia Tech. Additional information about the project can be found at: www.cc.gatech.edu/~zha/socialL2R.html