This proposal aims to develop a new computing paradigm for building more effective cloud computing schemes for web-scale multimedia search and learning. It addresses both algorithm and system design, and aims to bridge the gap between the requirement for fast search and the burden of processing high-dimensional multimedia features. Loading and computing on high-dimensional data are both well known to be expensive. The proposed paradigm summarizes the data in small chunks and uses those summaries to estimate lower and upper bounds on the search measure. Based on these bounds, the paradigm can filter out many data samples before loading them, thereby reducing transmission and computation overhead. The new paradigm generalizes Google's MapReduce computing paradigm for the task of searching high-dimensional data, and is better suited to multimedia data processing than the general-purpose paradigm.
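The bound-based filtering described above can be illustrated with a minimal sketch. It is not the proposal's actual method: it assumes Euclidean distance as the search measure, nearest-neighbor search as the task, and a centroid-plus-radius summary for each small block of data (called a chunk here). The names `ChunkSummary`, `summarize`, `bounds`, and `nearest_neighbor` are hypothetical. By the triangle inequality, every point in a chunk lies at a distance from the query between `d - radius` and `d + radius`, where `d` is the distance from the query to the chunk centroid, so a chunk whose lower bound exceeds the smallest upper bound of any chunk can be skipped without being loaded.

```python
import math
from dataclasses import dataclass


@dataclass
class ChunkSummary:
    """Assumed summary: centroid plus the largest member-to-centroid distance."""
    centroid: tuple
    radius: float


def summarize(chunk):
    """Build the summary once, ahead of query time."""
    dim = len(chunk[0])
    centroid = tuple(sum(p[i] for p in chunk) / len(chunk) for i in range(dim))
    radius = max(math.dist(centroid, p) for p in chunk)
    return ChunkSummary(centroid, radius)


def bounds(query, summary):
    """Lower and upper bound on the distance from query to any point in the chunk."""
    d = math.dist(query, summary.centroid)
    return max(0.0, d - summary.radius), d + summary.radius


def nearest_neighbor(query, chunks, summaries):
    """Return (best_point, best_dist, chunks_loaded), using summaries to prune."""
    chunk_bounds = [bounds(query, s) for s in summaries]
    # The true nearest neighbor is no farther than the smallest upper bound.
    threshold = min(ub for _, ub in chunk_bounds)
    best_point, best_dist, loaded = None, math.inf, 0
    # Visit chunks in order of increasing lower bound.
    order = sorted(range(len(chunks)), key=lambda i: chunk_bounds[i][0])
    for i in order:
        if chunk_bounds[i][0] > min(threshold, best_dist):
            break  # every remaining chunk has an even larger lower bound
        loaded += 1
        for p in chunks[i]:  # stands in for loading the chunk from storage
            d = math.dist(query, p)
            if d < best_dist:
                best_point, best_dist = p, d
    return best_point, best_dist, loaded


chunks = [
    [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
    [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0)],
    [(100.0, 100.0), (101.0, 100.0)],
]
summaries = [summarize(c) for c in chunks]
point, distance, loaded = nearest_neighbor((0.2, 0.1), chunks, summaries)
print(point, round(distance, 4), loaded)
```

In this toy run only the first chunk is loaded; the other two are rejected from their summaries alone. In a distributed setting the summaries would be small enough to keep in memory while the chunks stay on remote storage, which is where the savings in transmission would come from under these assumptions.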
The intellectual merit of this proposal lies in exploiting the computing resources offered by cloud computing and developing novel algorithms that perform multimedia data search in a distributed and efficient manner. The PI's ambition of making cloud computing suitable for high-dimensional numerical data, if successful, will revolutionize the future of cloud computing and have a tremendous impact on society at large. The difficulty of the problem, together with its potential payoff and impact, makes this proposal ideally suited for the EAGER program.
The goal of this project is to develop a new computing paradigm for next-generation multimedia search. Efficiently searching multimedia documents in large-scale datasets has posed a serious challenge over the past five years, mainly because of the difficulty of processing high-dimensional numerical features. The outcomes of our research are as follows.
We developed a novel large-scale visual recognition system that ranked first in the ImageNet challenge (www.image-net.org/challenges/LSVRC/2010/), the largest image classification competition. Our method obtains 52.9% classification accuracy and a 71.8% top-5 hit rate on millions of images in one thousand categories. We also proposed a new large-scale image indexing method that can search a billion-scale image dataset in less than 30 microseconds.
We also considered the problem of measuring the "similarity" of objects in large graphs and social networks, with the aim of exploiting the rich information in the network structure. Intuitively, in a movie sharing network, a video might be interesting to a user if the user likes another, similar movie. In this work, we designed an efficient algorithm, named Delta-SimRank, for distributed computing systems. In the best case, we observe up to a 30-fold speed-up in comparisons on a distributed MapReduce system.
Our research can also be used for geographical image understanding. We studied millions of personal photos on Flickr that are associated with user tags and geographical information, and considered two applications of mining geographical image databases. First, we can recommend tourist trajectories based on geo-tagged photos. Second, we can discover geographical topics by combining geographical clustering and text topic mining in one framework. Our method can be used to compare the geographical distributions of different topics, including food, landscape, and products.