A growing variety of applications rely on crowdsourcing services, such as Amazon Mechanical Turk or CrowdFlower, to access human computation at a large scale and solve problems that cannot be tackled using machine computation alone. Despite the surge in crowdsourcing platforms, it remains very difficult and error-prone to employ crowdsourcing within an application: existing services mostly expose a procedural interface for posting individual human-computation tasks, and provide little support (if any) for task coordination, reward management, or clean-up of the obtained answers. As a result, a large fraction of the application logic is devoted to orchestrating and optimizing the interaction with the crowdsourcing service. This project explores the novel paradigm of declarative crowdsourcing through the development of the Deco database system. Deco models the collective knowledge of the crowd and supports access to it through declarative queries posed over a relational-like database. The project explores methods to mitigate the effect of "noisy" human workers who provide low-quality data, and to model the resulting uncertainty in the answers returned by Deco. These two problems are tightly coupled through a tradeoff among the latency of contacting human workers, the expense of recruiting them, and the quality of the data they provide. Handling this tradeoff in the context of query optimization is one of the key technical challenges addressed by the project.

The project represents a high-risk research effort, as it targets non-trivial problems that are inherent in the practical use of crowdsourcing. The corresponding high payoff is that the results of this research provide a robust and principled foundation for declarative crowdsourcing, thus enabling a wide variety of applications to incorporate crowdsourcing as a core component of their software stack. Moreover, the project identifies features that crowdsourcing services should offer in order to support this novel declarative interface, thereby providing valuable guidance for the design of next-generation crowdsourcing platforms. Finally, the project provides training to students and the opportunity to engage in the emerging research area at the intersection of databases and crowdsourcing. Details of the project can be found at the project web site (http://db.cs.ucsc.edu/deco).
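
As an illustration of what mitigating "noisy" workers and modeling the resulting uncertainty can look like in the simplest case, the sketch below aggregates several workers' answers to one question by majority vote and reports the fraction of workers who agreed. This is an assumption made for illustration only, not a description of how Deco resolves answers; the function name `aggregate_answers` and the sample data are hypothetical.

```python
from collections import Counter


def aggregate_answers(answers):
    """Return (majority answer, fraction of workers who gave it).

    Hypothetical helper for illustration; not part of Deco.
    On a tie, the answer that appeared first in the input wins.
    """
    if not answers:
        raise ValueError("need at least one worker answer")
    counts = Counter(answers)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(answers)


# Made-up answers from five workers to the same question.
worker_answers = ["Paris", "Paris", "Lyon", "Paris", "paris "]

# Light normalization so trivially different spellings count as one answer.
normalized = [a.strip().lower() for a in worker_answers]

value, agreement = aggregate_answers(normalized)
print(value, agreement)
```

Asking more workers would typically raise confidence in the result, but at a higher monetary cost and longer latency, which is the tradeoff described above.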

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1251827
Program Officer: Maria Zemankova
Project Start:
Project End:
Budget Start: 2012-10-01
Budget End: 2014-09-30
Support Year:
Fiscal Year: 2012
Total Cost: $199,931
Indirect Cost:
Name: University of California Santa Cruz
Department:
Type:
DUNS #:
City: Santa Cruz
State: CA
Country: United States
Zip Code: 95064