This project develops techniques to eliminate partial duplicates in image search over a large-scale database. The research team explores efficient spatial representation schemes for partial-duplicate image retrieval. Images are represented by the Bag-of-Visual-Features model, in a manner similar to the way a text document is represented by a set of words. A novel scheme, spatial coding, is designed to encode the spatial relationships among local features in an image. Based on the spatial codes of images, the initial matches of local features between images are verified, and false matches are identified and removed effectively and efficiently. As a result, the similarity between images based on local feature matching is determined more accurately, and retrieval performance is greatly improved. The approach enjoys the merit of scalability and is an initial step toward billion-scale partial-duplicate image retrieval. The project is developing a real-time partial-duplicate image search system with sound recall on a 10-million web image database.
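As a rough illustration of the Bag-of-Visual-Features representation mentioned above, the sketch below quantizes local feature descriptors against a visual vocabulary and builds a normalized visual-word histogram. The function name, toy descriptors, and vocabulary are hypothetical; this is a minimal sketch of the general technique, not the project's implementation:

```python
import numpy as np

def bag_of_visual_features(descriptors, vocabulary):
    """Quantize local descriptors (e.g., SIFT) against a visual vocabulary
    and return an L1-normalized histogram of visual-word occurrences.

    descriptors: (n, d) array of local feature descriptors.
    vocabulary:  (k, d) array of cluster centers (the visual words).
    """
    # Assign each descriptor to its nearest visual word (Euclidean distance).
    dists = np.linalg.norm(descriptors[:, None, :] - vocabulary[None, :, :], axis=2)
    words = dists.argmin(axis=1)
    # Count occurrences of each visual word and normalize, as with
    # term frequencies in a text bag-of-words model.
    hist = np.bincount(words, minlength=len(vocabulary)).astype(float)
    return hist / hist.sum()
```

With such histograms, two images can be compared by standard vector similarity, just as text documents are compared by their word statistics.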
The research on spatial coding addresses critical problems in multimedia visual information retrieval. The proposed spatial coding approaches have broad impact on various research fields and applications, such as image retrieval, image categorization, and object recognition. Finally, research and education are integrated by providing research opportunities for graduate and undergraduate students to select their research topics and senior projects.
In this project, we propose a novel spatial coding scheme for large-scale partial-duplicate image search. Spatial context plays a key role in visual identification, and representing the spatial context of local features so as to facilitate image comparison is significant for advancing large-scale image search. Our proposed spatial coding scheme encodes the relative spatial relationships of local features into binary maps. Consequently, the problem of geometric verification is converted into a comparison of binary spatial coding maps, which is very efficient to implement. Our verification is performed in a global manner and achieves better retrieval accuracy than state-of-the-art approaches. Moreover, since the main computation of the spatial coding scheme consists of addition and exclusive-OR operations, its computational complexity is low. We have evaluated our approach, with promising performance, on a 10-million full-size image dataset, the largest in the academic community. Further, beyond our spatial coding work, we find that by subtly exploiting the geometric clues of local features, we can adjust feature coordinates and adaptively achieve rotation invariance. Our spatial coding scheme can be regarded as an efficient post-geometric-verification algorithm. Beyond that, it can also be applied to discover repeatable visual patterns with semantics by exploring large collections of images. Moreover, since our spatial coding scheme achieves high image search precision, the retrieval results can naturally be mined for image annotation. Further, its extension to video search is straightforward.
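To make the binary-map idea concrete, the sketch below encodes, for each pair of matched features, whether one lies to the right of (or above) the other, and then compares the query and candidate maps with exclusive-OR; matches with many inconsistent bits are greedily removed. The map definition, function names, and the removal loop are our simplified illustration under these assumptions, not the exact published algorithm:

```python
import numpy as np

def spatial_maps(points):
    """Binary spatial coding maps for a set of 2-D feature coordinates:
    X[i, j] = 1 iff point j lies strictly to the right of point i,
    Y[i, j] = 1 iff point j lies strictly above point i."""
    x, y = points[:, 0], points[:, 1]
    X = (x[None, :] > x[:, None]).astype(np.uint8)
    Y = (y[None, :] > y[:, None]).astype(np.uint8)
    return X, Y

def verify_matches(query_pts, cand_pts, max_inconsistency=0):
    """Compare the query and candidate maps via XOR and iteratively drop
    the matched feature with the most inconsistent bits, until the
    remaining matches are spatially consistent. Returns kept indices."""
    keep = np.arange(len(query_pts))
    while len(keep) > 1:
        Xq, Yq = spatial_maps(query_pts[keep])
        Xc, Yc = spatial_maps(cand_pts[keep])
        # Per-match inconsistency count: XOR the maps and sum each row.
        incons = (Xq ^ Xc).sum(axis=1) + (Yq ^ Yc).sum(axis=1)
        worst = incons.argmax()
        if incons[worst] <= max_inconsistency:
            break
        keep = np.delete(keep, worst)  # remove the likely false match
    return keep
```

Because verification reduces to XOR and addition over small binary matrices, it avoids the cost of estimating a full geometric transformation, which is the efficiency argument made above.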