This is an inter-institutional collaborative project involving Ramesh Jain at the University of California, San Diego and Raghu Ramakrishnan at the University of Wisconsin, Madison. The research focuses on content-based querying of large collections of images stored in a general-purpose (i.e., application-domain-independent) database. The goal is to develop scalable methods for dealing with very large numbers of images by exploiting similarities within groups of images, e.g., a group of frontal face images. The approach is to let the user describe a collection of images in sufficient detail for the system to tailor feature extraction, indexing, and related steps to support a rich class of content-based queries. Descriptions are written in an "image data definition language" and are specified in terms of image characteristics such as color and shape. The system architecture is designed to be extensible: user-provided feature extraction algorithms can be easily incorporated, and efficient high-dimensional indexing techniques are used. While there have been many efforts to build systems that can effectively retrieve images from a given image domain, a general-purpose system poses many difficult problems. This research is expected to greatly increase the expressiveness of content-based queries that can be efficiently supported in general image databases, making possible novel applications involving large numbers of images from diverse domains.
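
As a rough illustration of the extensible architecture described above, the following Python sketch shows how user-provided feature extractors might be plugged into a collection and used for a similarity query. Everything in it is an assumption made for illustration and is not taken from the project: the names `register`, `EXTRACTORS`, `mean_color`, and `ImageCollection` are hypothetical, a plain list of feature names stands in for the "image data definition language", and a linear scan over feature vectors stands in for the high-dimensional index.

```python
from math import sqrt
from typing import Callable, Dict, List, Tuple

Pixel = Tuple[int, int, int]          # one RGB pixel
Image = List[Pixel]                   # toy image: a flat list of pixels
Extractor = Callable[[Image], List[float]]

# Hypothetical plug-in registry: maps a feature name to its extractor.
EXTRACTORS: Dict[str, Extractor] = {}


def register(name: str) -> Callable[[Extractor], Extractor]:
    """Decorator that adds a user-provided extractor under a given name."""
    def wrap(fn: Extractor) -> Extractor:
        EXTRACTORS[name] = fn
        return fn
    return wrap


@register("mean_color")
def mean_color(image: Image) -> List[float]:
    """Average of each RGB channel over all pixels."""
    n = len(image)
    return [sum(pixel[c] for pixel in image) / n for c in range(3)]


def euclidean(a: List[float], b: List[float]) -> float:
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class ImageCollection:
    """A collection described by the list of features it should support.

    The feature list is an assumed stand-in for a collection description.
    """

    def __init__(self, features: List[str]) -> None:
        self.features = features
        self.vectors: Dict[str, List[float]] = {}

    def _describe(self, image: Image) -> List[float]:
        # Concatenate the outputs of every extractor named in the description.
        vector: List[float] = []
        for name in self.features:
            vector.extend(EXTRACTORS[name](image))
        return vector

    def add(self, image_id: str, image: Image) -> None:
        self.vectors[image_id] = self._describe(image)

    def query(self, image: Image, k: int = 1) -> List[str]:
        # Linear scan; a real system would use a high-dimensional index here.
        q = self._describe(image)
        ranked = sorted(self.vectors, key=lambda i: euclidean(q, self.vectors[i]))
        return ranked[:k]


if __name__ == "__main__":
    collection = ImageCollection(["mean_color"])
    collection.add("red", [(255, 0, 0)] * 4)
    collection.add("blue", [(0, 0, 255)] * 4)
    print(collection.query([(200, 10, 10)] * 4, k=1))
```

Keeping extractors in a registry keyed by name means a new image characteristic can be added without touching the collection or query code, which is one plausible reading of the extensibility goal stated above.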