CVISION Technologies Inc. proposes to develop novel optical character recognition (OCR) algorithms that increase the independence of visually impaired people and improve their day-to-day lives. The project will result in OCR software that lets digital cameras, including cell phone cameras, read documents, street signs, and other text. The recognized text can then be read back to the user with text-to-speech technology, or sent to a computer for display or printing at a legible size. In either mode, the software will give the visually impaired user greater independence when, for example, traveling or signing legal forms, without the need for specialized equipment or the assistance of other people. The research will specifically address the problem of accurate OCR for text in low-resolution media such as digital video or digital camera images, which gives current OCR technology a great deal of difficulty. The central innovation in this work is a bottom-up approach to OCR that does not first threshold the image into black and white (i.e., text and background). This contrasts with conventional OCR systems, which depend on the existence of a threshold that separates the text from the background and then apply a top-down approach. The top-down approach breaks the text in the image into increasingly smaller units (paragraph, line, word, etc.), down to the glyph (character) level. This works well when the sampling resolution is topology-preserving, usually higher than about 200 dpi. The proposed bottom-up approach works in the opposite direction, identifying the smallest units of the text first and building up to larger and larger units.
The vague borders between blurred, undersampled letters that are typical of low-resolution captured text cause conventional OCR algorithms to fail, but they can be handled effectively by working on the grayscale image directly, without thresholding, and then applying the bottom-up approach. Another benefit of this method is that it does not require a calibrated scanning system (e.g., uniform lighting, horizontal text) to operate successfully. This will further increase the accuracy of the CVISION system for text in images captured by cell phones and other non-calibrated sources.
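
The point about thresholding and topology can be illustrated with a toy example. The sketch below is not CVISION's algorithm; it is a minimal illustration under assumed values. It takes a one-dimensional intensity profile across two dark strokes separated by a narrow gap, simulates a coarse sensor by box-averaging, and compares what survives binarization with what survives in the grayscale values. The function names, the threshold of 128, and the sample data are all hypothetical.

```python
def downsample(profile, factor):
    """Box-average consecutive groups of `factor` samples (a crude model of a coarse sensor)."""
    return [sum(profile[i:i + factor]) / factor
            for i in range(0, len(profile) - factor + 1, factor)]


def count_dark_runs(profile, threshold=128):
    """Binarize at `threshold` and count connected runs of dark (text) samples."""
    runs, inside = 0, False
    for value in profile:
        dark = value < threshold
        if dark and not inside:
            runs += 1
        inside = dark
    return runs


def count_gray_valleys(profile):
    """Count strict local minima in the grayscale profile, with no thresholding.

    Strict minima are enough for this toy signal; flat plateaus are not handled.
    """
    return sum(1 for i in range(1, len(profile) - 1)
               if profile[i] < profile[i - 1] and profile[i] < profile[i + 1])


# Hypothetical high-resolution scan line: 255 = background, 0 = ink.
# Two strokes (5 and 6 samples wide) separated by a 1-sample gap.
high_res = [255] * 4 + [0] * 5 + [255] * 1 + [0] * 6 + [255] * 4

# Assumed 4x coarser sampling: the gap is averaged into a dark gray pixel.
low_res = downsample(high_res, 4)

print(low_res)                      # [255.0, 0.0, 63.75, 0.0, 255.0]
print(count_dark_runs(high_res))    # 2: thresholding keeps the strokes separate
print(count_dark_runs(low_res))     # 1: thresholding merges the two strokes
print(count_gray_valleys(low_res))  # 2: the grayscale values still show both strokes
```

At the coarse resolution the gap pixel (63.75) falls below the threshold, so binarization fuses the two strokes into one and the topology is lost, while the unthresholded profile still shows two distinct intensity minima.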

Public Health Relevance

The proposed project is relevant to public health because it empowers the visually impaired to live with greater independence. For example, imagine a visually impaired person using an ordinary cell phone camera to photograph an apartment lease that they cannot read, using the software developed in this project to perform OCR on the image, and then having the phone read the contract aloud through its speaker.

Agency: National Institutes of Health (NIH)
Institute: National Eye Institute (NEI)
Type: Small Business Innovation Research Grants (SBIR) - Phase I (R43)
Project #: 1R43EY018979-01
Application #: 7480848
Study Section: Special Emphasis Panel (ZRG1-BDCN-F (12))
Program Officer: Wujek, Jerome R
Project Start: 2008-05-01
Project End: 2009-04-30
Budget Start: 2008-05-01
Budget End: 2009-04-30
Support Year: 1
Fiscal Year: 2008
Total Cost: $100,000
Indirect Cost:
Name: Cvision Technologies, Inc.
Department:
Type:
DUNS #: 100604276
City: Forest Hills
State: NY
Country: United States
Zip Code: 11375