
Handwriting Detector for Recognition Machines

IP.com Disclosure Number: IPCOM000079015D
Original Publication Date: 1973-Apr-01
Included in the Prior Art Database: 2005-Feb-26
Document File: 1 page(s) / 12K

Publishing Venue

IBM

Related People

Baumgartner, RJ: AUTHOR

Abstract

In some recognition systems, it is necessary to separate documents containing machine-printed characters from those containing handwritten printing or script, so that the latter documents may be set aside for special handling. An example of such a system is a mail reader, where hand-addressed mailpieces may remain in stacks of supposedly machine-printed mail. The technique described here may also be used in an automatic enricher, for culling handwritten documents or mailpieces from intermixed batches before the batches are fed into a recognition system.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 71% of the total text.



The presence of handwriting on a document is detected by measuring the distribution of character heights on each line of the document. The raw video from each document line is first separated into individual characters by conventional segmentation methods. The height of each character is measured, and a tally is kept, for each line, of the number of characters whose heights lie within each of a plurality of overlapping 16-mil ranges. The ranges may be, for instance, 56-72 mils, 64-80 mils, 72-88 mils, 80-96 mils, etc., up to 232-248 mils, so that successive ranges are offset by 8 mils. The document is then classified as handwritten unless at least one line meets both of the following criteria: (1) the line has more than 12 characters; and (2) at least half of the characters on that line have heights which lie within a single one of the height ranges.
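
The decision rule above can be illustrated with a short Python sketch. It is not the disclosed implementation, only one reading of the text under stated assumptions: character heights are taken as already measured (segmentation is out of scope), the 8-mil offset between ranges is inferred from the example ranges, and range boundaries are treated as inclusive at both ends, which the text does not specify. All names (HEIGHT_RANGES, tally_heights, line_looks_machine_printed, is_handwritten) are hypothetical.

```python
from typing import List, Sequence, Tuple

# Overlapping 16-mil ranges: 56-72, 64-80, ..., 232-248 mils.
# ASSUMPTION: the 8-mil step is inferred from the example ranges in the text.
RANGE_WIDTH_MILS = 16
RANGE_STEP_MILS = 8
HEIGHT_RANGES: List[Tuple[int, int]] = [
    (low, low + RANGE_WIDTH_MILS) for low in range(56, 232 + 1, RANGE_STEP_MILS)
]

# Criterion (1): a qualifying line has MORE than this many characters.
MIN_CHARS = 12


def tally_heights(heights: Sequence[float]) -> List[int]:
    """Count, per range, the characters whose height lies in that range.

    ASSUMPTION: both range boundaries are inclusive.
    """
    return [
        sum(1 for h in heights if low <= h <= high)
        for low, high in HEIGHT_RANGES
    ]


def line_looks_machine_printed(heights: Sequence[float]) -> bool:
    """Apply criteria (1) and (2) to the character heights of one line."""
    if len(heights) <= MIN_CHARS:
        return False
    # Criterion (2): at least half of the characters share a single range.
    return 2 * max(tally_heights(heights)) >= len(heights)


def is_handwritten(document_lines: Sequence[Sequence[float]]) -> bool:
    """Classify as handwritten unless at least one line meets both criteria."""
    return not any(line_looks_machine_printed(line) for line in document_lines)


if __name__ == "__main__":
    printed = [[70.0] * 20]                    # uniform heights, long line
    script = [[60, 100, 140, 180, 220] * 4]    # widely scattered heights
    print(is_handwritten(printed), is_handwritten(script))
```

Because the ranges overlap, a cluster of heights that straddles the edge of one range still falls wholly inside a neighboring range, so a uniform line is not penalized for where its heights happen to sit relative to the range boundaries.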

The above decision rests upon the observation that machine-printed charact...