This proposal is to develop a general representation framework that uses similarity to capture relationships in large-scale image collections. The representation is not restricted to any specific distance function, feature, or learning model. It includes new methods to combine multiple kernels based on different cues, to learn low-rank kernels, and to improve indexing efficiency, along with new methods for nearest neighbor search and semi-supervised learning; the work is relevant to research agendas in both machine learning and computer vision. Two major research problems are addressed: (1) defining and computing similarities between images in vast, expanding repositories, and representing those similarities efficiently so that the right pairs can be retrieved on demand; and (2) developing a system that can learn and predict similarities from sparse supervisory information and constantly evolving data. The approach is notable for embracing the scale of web archives and for combining textual and visual means of analysis.
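
To make the multi-cue kernel combination concrete, the following is a minimal sketch, not the proposal's actual method: it assumes simple RBF kernels over two hypothetical feature cues (stand-ins for, e.g., color and texture descriptors), forms a convex combination K = sum_m beta_m * K_m with illustrative hand-set weights where a multiple-kernel-learning step would instead learn them, and retrieves nearest neighbors under the combined kernel.

    import numpy as np

    def rbf_kernel(X, Y, gamma=1.0):
        # Pairwise squared Euclidean distances between rows of X and Y,
        # mapped through the Gaussian (RBF) kernel.
        d2 = np.sum(X**2, 1)[:, None] + np.sum(Y**2, 1)[None, :] - 2 * X @ Y.T
        return np.exp(-gamma * np.maximum(d2, 0.0))

    def combined_kernel(cues_X, cues_Y, betas, gammas):
        # Convex combination of per-cue kernels: K = sum_m beta_m * K_m.
        K = np.zeros((cues_X[0].shape[0], cues_Y[0].shape[0]))
        for Xm, Ym, bm, gm in zip(cues_X, cues_Y, betas, gammas):
            K += bm * rbf_kernel(Xm, Ym, gm)
        return K

    # Toy database: 100 images described by two hypothetical cues.
    rng = np.random.default_rng(0)
    db_color, db_tex = rng.normal(size=(100, 16)), rng.normal(size=(100, 32))
    q_color, q_tex = rng.normal(size=(1, 16)), rng.normal(size=(1, 32))

    betas = [0.7, 0.3]  # illustrative weights; an MKL step would learn these
    K = combined_kernel([q_color, q_tex], [db_color, db_tex], betas, [0.1, 0.05])
    top5 = np.argsort(-K[0])[:5]  # five nearest neighbors under the combined kernel
    print("top-5 neighbors:", top5)

Because each per-cue kernel remains a valid positive semidefinite kernel, their nonnegative weighted sum is one as well, so the combined similarity can plug into any kernel-based learner or indexing scheme without changing the downstream machinery.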