The goal of this project is to develop a domain-independent method for searching multimedia databases by content, e.g., finding the video clips in a collection that are similar to a given news broadcast. The approach extracts N numerical features from each object, effectively mapping the object to a point in N-dimensional space. Then, highly fine-tuned, off-the-shelf spatial access methods (such as R-trees) are used to cluster the data and to search very quickly for similar objects, that is, nearby points in the N-dimensional space. This approach is generic and can be used for any database of multimedia objects, as long as appropriate features can be derived. The specific applications in this project are: (a) natural and medical images, using state-of-the-art signal processing methods such as wavelets and morphology, and (b) voice and related time sequences, using methods that include time warping and Hidden Markov Models (HMMs). Applications of such a system are numerous, including computer-aided medical diagnosis and teaching (by using similarity search on older, expertly diagnosed medical images), queries on video by content (in art, entertainment, or journalism applications), and similarity searching in financial and scientific time sequences (for data mining, decision support, and forecasting).
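
The feature-mapping idea described above can be illustrated with a minimal Python sketch. It rests on several assumptions that do not come from the project itself: the three-feature extractor (mean, standard deviation, range) is hypothetical, the toy time sequences are made up for illustration, and a small in-memory k-d tree stands in for an R-tree as the spatial access method.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, ...]


def extract_features(series: Sequence[float]) -> Point:
    """Hypothetical extractor: map a sequence to (mean, std, range)."""
    n = len(series)
    mean = sum(series) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in series) / n)
    return (mean, std, max(series) - min(series))


class KDNode:
    """One node of a k-d tree (a simple stand-in for an R-tree)."""

    def __init__(self, point: Point, label: str, axis: int,
                 left: Optional["KDNode"], right: Optional["KDNode"]):
        self.point, self.label, self.axis = point, label, axis
        self.left, self.right = left, right


def build(items: List[Tuple[Point, str]], depth: int = 0) -> Optional[KDNode]:
    """Recursively split the points on the median of one axis per level."""
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    point, label = items[mid]
    return KDNode(point, label, axis,
                  build(items[:mid], depth + 1),
                  build(items[mid + 1:], depth + 1))


def nearest(node: Optional[KDNode], query: Point,
            best: Optional[Tuple[float, str]] = None) -> Optional[Tuple[float, str]]:
    """Return (distance, label) of the stored point closest to `query`."""
    if node is None:
        return best
    d = math.dist(node.point, query)
    if best is None or d < best[0]:
        best = (d, node.label)
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Visit the far side only if the splitting plane is closer than the best hit.
    if abs(diff) < best[0]:
        best = nearest(far, query, best)
    return best


# Toy "database" of time sequences (illustrative data only).
database = {
    "flat": [1.0, 1.0, 1.0, 1.0],
    "ramp": [0.0, 1.0, 2.0, 3.0],
    "spike": [0.0, 0.0, 10.0, 0.0],
}
tree = build([(extract_features(s), name) for name, s in database.items()])

query = [0.1, 1.1, 2.0, 2.9]  # a slightly perturbed ramp
dist, label = nearest(tree, extract_features(query))
print(label)  # prints: ramp
```

In this sketch, the quality of the answer depends entirely on the chosen features: two objects are reported as similar only if their feature points are close, which is why the derivation of appropriate features is the domain-specific part of the approach.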