Enabling access to printed text for blind people via assisted mobile OCR

Manduchi, Roberto

Abstract

This application proposes new technology development and user studies aiming to facilitate the use of mobile Optical Character Recognition (OCR) for blind people. Mobile OCR systems, implemented as smartphone apps, have recently appeared on the market. This technology uses modern computer vision algorithms to enable a blind person to hear (via synthetic speech) the content of printed text imaged by the smartphone's camera. Unlike traditional OCR, which requires scanning a document with a flatbed scanner, mobile OCR apps enable access to text anywhere, anytime. Using their own smartphones, blind people can read store receipts, menus, flyers, business cards, utility bills, and many other printed documents of the type normally encountered in everyday life. Unfortunately, current mobile OCR systems suffer from a chicken-and-egg problem, which limits their usability. They require the user to take a well-framed snapshot of the document to be scanned, with the full text in view, and at a close enough distance that each character can be well resolved and thus readable by the machine. However, taking a good picture of a document is difficult without sight, and thus without the ability to look at the scene being imaged by the camera through the smartphone's screen. Anecdotal evidence, supported by results of preliminary studies conducted by the principal investigator's group, confirms that acquisition of an OCR-readable image of a document can indeed be very challenging for some blind users.

We plan to address this problem by developing and testing a new technique of assisted mobile OCR. As the user aims the camera at the document, the system analyzes in real time the stream of images acquired by the camera, and determines how the camera position and orientation should be adjusted so that an OCR-readable image of the document can be acquired. This information is conveyed to the user via a specially designed acoustic signal. This acoustic feedback allows users to quickly adjust and reorient the camera or the document, resulting in reduced access time and a more satisfactory user experience. Multiple user studies with blind participants are planned to select an appropriate acoustic interface and to evaluate the effectiveness of the proposed assisted mobile OCR modality.
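
The abstract does not specify how the system decides which adjustment to suggest or what the acoustic signal sounds like. As a purely illustrative sketch, one could imagine a per-frame rule that looks at where the detected text region falls in the image and returns a coarse hint. Everything below (the `Box` type, the `framing_hint` function, the `margin` and `min_fill` thresholds, and the hint strings) is a hypothetical assumption for illustration, not the investigators' method; it also assumes that some text detector has already produced a bounding box for the text in the frame.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Axis-aligned bounding box of detected text, in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float


def framing_hint(text_box: Box, frame_w: int, frame_h: int,
                 margin: float = 0.05, min_fill: float = 0.5) -> str:
    """Return a coarse guidance hint for one camera frame (hypothetical rule).

    margin:   fraction of the frame treated as a border; text inside the
              border is assumed to be cut off by the image edge.
    min_fill: minimum fraction of the frame the text should span (in at
              least one dimension) to be considered well resolved.
    """
    mx, my = margin * frame_w, margin * frame_h
    cut_left = text_box.left < mx
    cut_right = text_box.right > frame_w - mx
    cut_top = text_box.top < my
    cut_bottom = text_box.bottom > frame_h - my

    # Text touching two opposite edges cannot be fixed by panning.
    if (cut_left and cut_right) or (cut_top and cut_bottom):
        return "move farther"
    # Text cut off on one side: aim the camera toward that side.
    if cut_left:
        return "aim left"
    if cut_right:
        return "aim right"
    if cut_top:
        return "aim up"
    if cut_bottom:
        return "aim down"

    # Text fully in view but small: characters may be poorly resolved.
    width_frac = (text_box.right - text_box.left) / frame_w
    height_frac = (text_box.bottom - text_box.top) / frame_h
    if max(width_frac, height_frac) < min_fill:
        return "move closer"
    return "hold steady"


if __name__ == "__main__":
    # A 640x480 frame where the text runs off the left edge.
    print(framing_hint(Box(0, 100, 400, 300), 640, 480))
```

In a real system each hint would still need to be mapped to a sound, and the choice of that mapping is exactly what the planned user studies are meant to inform.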

Public Health Relevance

This application is concerned with the development of new technology designed to facilitate the use of mobile Optical Character Recognition (OCR) systems to access printed text without sight. Specifically, this exploratory research will develop and test a novel system that, by means of a specially designed acoustic interface, will help a blind person take a well-framed, well-resolved image of a document for OCR processing using a smartphone or wearable camera. If successful, this novel approach to assisted mobile OCR will reduce access time and improve the user experience for blind users of mobile OCR.

Funding Agency

Agency: National Institutes of Health (NIH)
Institute: National Eye Institute (NEI)
Type: Exploratory/Developmental Grants (R21)
Project #: 1R21EY025077-01
Application #: 8812658
Study Section: Special Emphasis Panel (ZRG1-ETTN-P (02))
Program Officer: Wiggs, Cheri

Project Start: 2015-01-01
Project End: 2016-12-31
Budget Start: 2015-01-01
Budget End: 2015-12-31
Support Year: 1
Fiscal Year: 2015
Total Cost: $191,510
Indirect Cost: $66,510

Institution

Name: University of California Santa Cruz
Department: Engineering (All Types)
Type: Schools of Engineering
DUNS #: 125084723

City: Santa Cruz
State: CA
Country: United States
Zip Code: 95064

Related projects


NIH 2016 R21 EY: Enabling access to printed text for blind people via assisted mobile OCR. Manduchi, Roberto / University of California Santa Cruz. $207,506
NIH 2015 R21 EY: Enabling access to printed text for blind people via assisted mobile OCR. Manduchi, Roberto / University of California Santa Cruz. $191,510

Publications

Qin, Siyang; Manduchi, Roberto (2017) Cascaded Segmentation-Detection Networks for Word-Level Text Spotting. Proc Int Conf Doc Anal Recognit 2017:1275-1282

Cutter, Michael; Manduchi, Roberto (2017) Improving the Accessibility of Mobile OCR Apps Via Interactive Modalities. ACM Trans Access Comput 10:

Qin, Siyang; Manduchi, Roberto (2016) A Fast and Robust Text Spotter. Proc IEEE Workshop Appl Comput Vis 2016:

Cutter, Michael; Manduchi, Roberto (2015) Towards Mobile OCR: How To Take a Good Picture of a Document Without Sight. Proc ACM Symp Doc Eng 2015:75-84
