This Small Business Innovation Research Phase I project focuses on the opportunities in organizing consumers' opinions, a form of unstructured information that makes up a significant portion of Internet content. Organizing this information will lead to efficiency gains in current markets and will enable emerging ones. This proposal integrates prediction markets and casual games to generate a structured, dynamic, non-ad-hoc index of consumers' true opinions on a large scale. Prediction markets are explicitly designed mechanisms in which consumers are given incentives to reveal their opinions truthfully through trading games. Historically, however, prediction markets have not scaled. Casual games are informal problem-solving mechanisms that have been shown to scale massively through the Internet and mobile devices. However, casual games are ad hoc and do not produce a coordinated, purposeful dataset. The intellectual merit of the proposed research lies in integrating these two mechanisms and applying the result to the commercial enterprise.
The broader impacts of this research are the ability to (1) provision the collective's opinions on a larger scale by lowering barriers to mass participation in a complex mechanism, which in turn would (2) decrease uncertainty and increase confidence in the quality of the information, (3) create greater efficiency in current decision-making processes, (4) enable new markets to emerge given the reduced information asymmetries, and (5) yield spill-over benefits for many industrial sectors. The integrative approach is a novel contribution to software design methodologies for emerging social computing platforms, where the architecture is increasingly based on participation and less on monolithic designs. The proposed research will contribute to the development of the internal logic of the prediction market itself, through a better reward structure and lower transaction costs. If successfully deployed, the approach will enable lower-cost engagement for online research geared toward assessing consumer engagement and trends.
Intellectual merits
Personalizing a service requires knowing the preferences of the service consumer. In domains where goods are well-structured (inventory is standardized, stationary and substitutable across vendors) and consumers have many opportunities to interact with the goods, preferences can be revealed implicitly through consumers' selection actions. Books, music and movies sold by vendors with scale (such as Amazon, Apple and Netflix) are instances of such goods. Collaborative recommendation techniques can be used in such domains to reason about which goods match which users. However, in domains where goods are unstructured (inventory is not standardized, is non-stationary or perishable, and is not substitutable across vendors) and consumers have few opportunities to interact with the goods, consumers have few opportunities to reveal their preferences through selection actions. Deals, news and events are examples of such goods. Under these circumstances, explicit revelation of preferences by the consumer can be the better option. Our goal was to capture and model users' demographics, interests and expertise so as to match them to other consumers and to an inventory of unstructured items. We assume interests and expertise are expressed in natural language. Reasoning about preferences expressed in natural language is hard. For instance, what is the similarity relationship between the expressed preferences "Obama", "hope" and "cheesecake"? This is a trivial task for humans, but determining this relationship in software is non-trivial given the complexity of language and its usage. What about the demographic characteristics of the individuals themselves? Deriving demographic relationships in software, although simpler than deriving linguistic ones, is nonetheless non-trivial. And how do we combine the variables of interest?
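To make the relatedness question concrete, the following is a minimal sketch in the spirit of Explicit Semantic Analysis (Gabrilovich and Markovitch, 2007): each word is represented as a vector of TF-IDF weights over "concepts" (encyclopedia articles), and two words are compared by the cosine of their vectors. It is not the project's implementation. The four-article corpus, the names CONCEPTS, build_index and relatedness, and the whitespace tokenization are all assumptions made for illustration; a real system would index Wikipedia itself.

```python
import math
from collections import Counter

# Hypothetical toy corpus standing in for Wikipedia articles (an assumption,
# not data from the project). Keys are concept titles, values are article text.
CONCEPTS = {
    "Barack Obama": "obama president election campaign hope change speech",
    "Presidential campaign": "campaign election hope change slogan obama vote",
    "Cheesecake": "cheesecake dessert cream cheese sugar crust bake",
    "Dessert": "dessert sugar cake cheesecake sweet bake",
}


def build_index(concepts):
    """Map each word to a sparse vector of TF-IDF weights over concepts."""
    n_concepts = len(concepts)
    term_counts = {title: Counter(text.split()) for title, text in concepts.items()}
    # Document frequency: in how many concepts does each word occur?
    doc_freq = Counter(word for counts in term_counts.values() for word in counts)
    index = {}
    for title, counts in term_counts.items():
        for word, tf in counts.items():
            idf = math.log(n_concepts / doc_freq[word])
            index.setdefault(word, {})[title] = tf * idf
    return index


def relatedness(index, word_a, word_b):
    """Cosine similarity between the concept vectors of two words."""
    vec_a = index.get(word_a, {})
    vec_b = index.get(word_b, {})
    dot = sum(weight * vec_b.get(concept, 0.0) for concept, weight in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    norm = norm_a * norm_b
    # Unknown words, or words occurring in every concept, have a zero vector.
    return dot / norm if norm else 0.0


INDEX = build_index(CONCEPTS)

if __name__ == "__main__":
    for pair in [("obama", "hope"), ("obama", "cheesecake"), ("hope", "cheesecake")]:
        print(pair, round(relatedness(INDEX, *pair), 3))
```

In this toy corpus "obama" and "hope" occur in the same articles and so score as closely related, while "cheesecake" shares no article with either and scores zero. With a real encyclopedia the vectors are far larger and noisier, which is where weighting and noise reduction become important.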
Our innovation is an attempt to design data models, algorithms and APIs that begin to understand relationships between individuals and objects along demographic and linguistic dimensions. To do so we use open public knowledge bases such as the US Census and the Wikipedia encyclopedia. The primary methods used belong to the Information Retrieval paradigm.
Broader Impacts
The innovation carried out in this research contributes along both academic and application dimensions. The algorithms developed, specifically the Natural Language Processing algorithms, extend and refine the "Explicit Semantic Analysis" work of Gabrilovich and Markovitch (2007). We show how the collective wisdom of the crowd (in the form of Wikipedia encyclopedia articles) can be used as a corpus for algorithms that model the relationships between words. We also contribute to how the noise in the wisdom of the crowd can be reduced. Our innovation also makes contributions to the marketplace. In particular, it can be applied to any vertical whose current supply needs to be better matched to actual or predicted demand: if service providers could better predict consumer demand, they could better optimize their business models. To achieve this, we propose to give end consumers the ability to express their preferences for the goods. This way, demand expressed by individual end consumers can be aggregated and fed upstream to either wholesale or retail markets, which can then compose personalized offers at some level of aggregation.
Evgeniy Gabrilovich and Shaul Markovitch. Computing semantic relatedness using Wikipedia-based explicit semantic analysis. In IJCAI, pages 1606-1611, 2007.