SBIR Phase I: Real-time, accurate OCR from Video using Intra- and Inter-Frame Machine Learning

Gross, Ari

Abstract

This Small Business Innovation Research (SBIR) Phase I research project focuses on developing real-time algorithms that automatically find and recognize text in digital video of complex 3-D environments, using machine learning of fonts and text strings. Essentially, the project takes optical character recognition (OCR) from being a technology for 2-D documents and brings it to the 3-D world. The project builds on existing algorithms for OCR of documents where conventional OCR fails: colorful brochures, magazine covers, and other sources where photographs, line art, and arbitrarily rotated text greatly complicate the OCR process. The project aims to extend this technology to the problem of finding and recognizing text in complex real-world 3-D scenes, such as street signs and storefronts, where the text may be at any 3-D angle to the camera. Critical to the success of this project is the algorithm's capability for machine learning of fonts.

Accurate OCR from video sources would benefit a number of applications. While OCR of text in video can be done today, it usually requires plainly obvious text, such as subtitles, and it cannot be done in real time. Real-time, accurate video OCR would enable applications that include 1) unaided indexing of digital video footage by the text it contains, 2) helping the blind navigate independently, both indoors and outdoors, 3) automated continuous roadside or vehicle-based license plate scanning, and 4) providing ground truth for improved GPS accuracy. Markets for the technology therefore include individuals, corporations, and government agencies. The societal impacts include 1) making digitized video libraries searchable by more metadata tags at low cost, 2) greater independence and safety for the blind, 3) improved road safety through automatic identification of cars reported stolen or owned by people with suspended licenses, and 4) improved GPS navigation accuracy. Technological impacts will be in the areas of machine learning applied to video OCR, real-time OCR, and low-resolution OCR.

Funding Agency

Agency: National Science Foundation (NSF)
Institute: Division of Industrial Innovation and Partnerships (IIP)
Type: Standard Grant (Standard)
Application #: 0810693
Program Officer: Ian M. Bennett

Project Start
Project End
Budget Start: 2008-07-01
Budget End: 2008-12-31
Support Year
Fiscal Year: 2008
Total Cost: $100,000
Indirect Cost

Institution: Cvision Technologies, Inc., Forest Hills, NY, United States
