Optical Character Recognition Test

IP.com Disclosure Number: IPCOM000104109D
Original Publication Date: 1993-Mar-01
Included in the Prior Art Database: 2005-Mar-18
Document File: 2 page(s) / 48K

Publishing Venue

IBM

Related People

Ett, AH: AUTHOR

Abstract

A method is developed for the printing of multi-font stress samples for use in the testing of Optical Character Recognition (OCR) engines. This method includes the ability to generate random character sequences, and to provide randomized errors in the positioning of the characters in both the horizontal and vertical domains while producing output samples of uniform quality on a laser printer, and data files for error analysis.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 78% of the total text.

      In the development of an Optical Character Recognition system,
it is usually necessary to generate statistically significant
quantities of test data samples, using a wide variety of type styles
and sizes, designed to stress test the OCR engine.  These samples
include items in which the characters are misregistered from their
nominal locations in both the vertical and horizontal domains, and
which consist of unrelated random sequences so that the effect of
contextual processing methods can be separated out.  Previously,
manual typewriters with substantial unorthodox manipulation were
required to generate such samples, and analysis of recognition
results required visual comparison of samples and results.

      This invention is a method of injecting into the PPDS data
stream of IBM laser printers, or the data stream of other laser
printers, control sequences that reposition the cursor in the
printing process from its nominal location to a randomized region
around the nominal location on a character-by-character basis.  The
invention then auto...
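
      The repositioning step described above might be sketched as
follows.  This is only an illustration under stated assumptions, not
the disclosed implementation: the actual PPDS positioning sequence is
not given in the text, so a hypothetical readable token produced by
position_command() stands in for it, and the function name
build_jittered_stream(), its parameters, the printer units, and the
layout of the per-character record are all invented for the example.
The record list corresponds loosely to the data files for error
analysis mentioned in the abstract.

```python
import random
import string


def position_command(x, y):
    """Hypothetical stand-in for a printer cursor-positioning sequence.

    The real PPDS control bytes are not given in the source text, so a
    readable token is emitted instead.
    """
    return f"<MOVE {x},{y}>"


def build_jittered_stream(length, pitch, baseline, max_dx, max_dy,
                          origin_x=0, seed=None,
                          alphabet=string.ascii_uppercase + string.digits):
    """Return (stream, truth) for one line of random, misregistered characters.

    Each character is drawn at random from `alphabet` and preceded by a
    positioning token that moves the cursor to the nominal location plus
    a random offset of at most max_dx / max_dy (assumed printer units).
    `truth` records what was printed and where, for later comparison
    against OCR output.
    """
    rng = random.Random(seed)  # seeded so a sample can be regenerated
    parts = []
    truth = []
    for i in range(length):
        ch = rng.choice(alphabet)            # unrelated random sequence
        nominal_x = origin_x + i * pitch     # nominal character location
        dx = rng.randint(-max_dx, max_dx)    # horizontal misregistration
        dy = rng.randint(-max_dy, max_dy)    # vertical misregistration
        parts.append(position_command(nominal_x + dx, baseline + dy) + ch)
        truth.append({"char": ch, "nominal_x": nominal_x,
                      "nominal_y": baseline, "dx": dx, "dy": dy})
    return "".join(parts), truth
```

      Seeding the generator is a design choice made here so that a
given sample and its record can be reproduced; the source does not
say whether the original method did this.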