III: Small: Entity Selection and Ranking for Data-Mining Applications

Terzi, Evimaria

Abstract

Expert-management portals such as linkedin.com, odesk.com and guru.com allow people to advertise their work or skills to the broader public. For example, LinkedIn has more than 120 million members, which allows potential employers, collaborators, etc. to discover individuals or groups of individuals with the desired expertise. Similarly, review-management sites like Amazon or Yelp collect large numbers of reviews about products or services. For example, the Kindle has more than 30,000 reviews on Amazon. Naturally, users cannot go through all these reviews and are helped significantly by the identification of a small subset of reviews that is sufficiently informative. Finally, as online social and media networks grow in importance as sources of news and other information, there is an urgent need for tools that automatically identify and recommend important nodes of the network that specific users may need to follow to fully exploit the power of online social media. In each of these scenarios, given a collection of entities (e.g., reviews about a product, experts that declare certain skills, network nodes or edges), the goal is to identify a subset of important entities (e.g., useful reviews, competent experts, influential nodes, respectively).

Existing work on recommender systems attempts to identify important entities either by entity ranking or by entity selection. Entity-ranking methods associate a score with each entity; they ignore the redundancy among highly scored entities. Entity-selection methods try to overcome this drawback by evaluating the desirability of a group of entities taken together; they attempt to identify the best subset of entities, while ignoring other subsets that may be equally good or almost as good as the best subset. Against this background, this project aims to overcome the drawbacks of existing entity-selection and entity-ranking methods through a synergistic integration of both into a common framework that allows entity ranking based on entity selection and entity selection based on entity ranking. In the resulting framework, the scores of individual entities are determined in part by the number of good groups of entities they can be part of, and good groups of entities consist of entities with high scores.
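
The interplay between ranking and selection described above can be illustrated with a toy example. The following is only a minimal sketch under stated assumptions, not the project's actual method: the expert names, their skills, the required skill set, the group size, and the function names `good_groups` and `participation_scores` are all hypothetical, and a "good group" is assumed here to mean a group whose combined skills cover every required skill. Each expert is then scored by the number of good groups it belongs to, and each good group is scored by the sum of its members' scores.

```python
from itertools import combinations

# Hypothetical toy data: each expert advertises a set of skills.
experts = {
    "ann": {"python", "stats"},
    "bob": {"stats", "viz"},
    "cat": {"python", "viz"},
    "dan": {"viz"},
}
required = {"python", "stats", "viz"}  # skills a task needs (assumed)


def good_groups(entities, required, size):
    """Return every group of `size` entities whose skills cover `required`."""
    groups = []
    for group in combinations(sorted(entities), size):
        covered = set().union(*(entities[name] for name in group))
        if required <= covered:
            groups.append(group)
    return groups


def participation_scores(entities, groups):
    """Score each entity by the number of good groups it belongs to."""
    scores = {name: 0 for name in entities}
    for group in groups:
        for name in group:
            scores[name] += 1
    return scores


# Ranking based on selection: count participation in good groups.
groups = good_groups(experts, required, size=2)
scores = participation_scores(experts, groups)
ranking = sorted(scores, key=lambda name: (-scores[name], name))

# Selection based on ranking: score each good group by its members' scores.
group_scores = {group: sum(scores[name] for name in group) for group in groups}

print(ranking)
print(group_scores)
```

Exhaustive enumeration is used only to keep the sketch short; it is exponential in the group size and would not scale to realistic collections of entities.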

The main challenge addressed by this work is how to explore the solution space of combinatorial problems in order to identify subsets of entities that participate in many good solutions. The resulting new practical methods for exploring the solution space of combinatorial problems find applications related to expert-management systems, management of online product reviews, and network analysis (including physical and social networks). The project also offers enhanced opportunities for research-based training of graduate and undergraduate students at Boston University. All of the research results, including publications, software, and data, will be freely disseminated to the broader research and educational community through the project website at www.cs.bu.edu/~evimaria/sel-and-ranking.html.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Application #: 1218437
Program Officer: Sylvia Spengler

Budget Start: 2012-09-01
Budget End: 2016-08-31
Fiscal Year: 2012
Total Cost: $499,958

Institution

Boston University, Boston, MA, United States
