Traditional approaches to document retrieval focus on conversion to electronic text followed by indexing of the text content. Recently, some work in the community has focused on indexing document image content directly. Such techniques break down when text content is limited or highly degraded. Work on document quality estimation will be extended beyond image quality to address structural quality, a factor that is important for determining whether traditional document processing operations will succeed. The team will then explore the effects of enhancement on classification and retrieval, and extend existing work to adapt to changes in quality. The research is motivated by the need for analysts to deal with very large collections of image data. The traditional goal of converting all documents to an electronic form and applying traditional text analysis methods fails when dealing with heterogeneous collections and very noisy (possibly multilingual) content. The approach will allow document image retrieval systems to scale to orders of magnitude beyond current capabilities, and permit users to move beyond content features and use structural similarity to explore large collections. This will permit users to mine large collections, through classification, for clusters of similar content without knowing a priori what the collection contains. The result will be adaptive techniques that can learn from small numbers of samples without knowledge of the sources of degradation.