EAGER: Document Image Quality Estimation, Enhancement, Classification and Retrieval

Davis, Larry; Doermann, David

Abstract

Traditional approaches to document retrieval focus on conversion to electronic text followed by indexing of the text content. Recently, some work in the community has focused on indexing document image content directly. Such techniques break down when text content is limited or highly degraded. Work on document quality estimation will be extended beyond image quality to address structural quality, a factor that is important for determining whether traditional document processing operations will succeed. The team will then explore the effects of enhancement on classification and retrieval and extend existing work to adapt to changes in quality. The research is motivated by the need for analysts to deal with very large collections of image data. The traditional goal of converting all documents to an electronic form and applying traditional text analysis methods fails when dealing with heterogeneous collections and very noisy (possibly multilingual) content. The approach will allow document image retrieval systems to scale to orders of magnitude beyond current capabilities, and permit users to move beyond content features and use structural similarity to explore large collections. This will permit users to mine large collections, through classification, for clusters of similar content without knowing a priori what the collection contains. The result will be adaptive techniques that can learn from small numbers of samples without knowledge of the sources of degradation.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Information and Intelligent Systems (IIS)
Type: Standard Grant (Standard)
Application #: 1359902
Program Officer: Sylvia Spengler

Project Start
Project End
Budget Start: 2013-10-01
Budget End: 2015-09-30
Support Year
Fiscal Year: 2013
Total Cost: $234,225
Indirect Cost

University of Maryland College Park, College Park, MD, United States
