Carpet correction method for multifont and handwritten characters

IP.com Disclosure Number: IPCOM000033944D
Original Publication Date: 2005-Jan-06
Included in the Prior Art Database: 2005-Jan-06

Publishing Venue

IBM

Abstract

United States Patent 5455875 describes a "carpet correction" method for verifying OCR results. However, when characters of different shapes, such as handwritten and machine printed characters, are intermingled on a screen, verification remains difficult even when carpet correction is applied. This disclosure describes an enhancement of that method. It separates handwritten and machine printed characters and also sorts the results in character code order. The enhanced method provides an easy-to-verify environment for OCR verification work.

This is the abbreviated version, containing approximately 52% of the total text.

[Background]

United States Patent 5455875, "System and method for correction of optical character recognition with display of image segments according to character data," [1] is a useful invention for correcting the recognition results of OCR systems. It is known as the carpet correction method and is intended for error correction across large numbers of OCR forms.

Carpet correction exploits a characteristic of human vision: a small number of differently shaped images can easily be spotted when many images of almost the same shape are displayed together.

However, when characters of different shapes, such as handwritten and machine printed characters, are intermingled on a screen, verification remains difficult even if they share the same character code.

In this case, verification operators waste time verifying recognition results even when they use the carpet correction method. A more effective correction method has therefore been needed.

[Details of invention]

There are many forms on which handwritten characters and machine printed characters are entered in the same fields. For example, as shown in Figure 1, an ID number field is filled with machine printed characters for existing customers, while the same field is left blank for new customers. In that case, a person in charge writes the new ID number in the field by hand.

On OCR forms of this kind, where ID fields contain both printed and handwritten characters, operators must verify the results much more carefully, even if the recognition results are sorted in character code order.

Figure 1. Example of mixed field (Handwritten characters and machine printed characters)

Figure 2 shows the conventional carpet correction method applied to this kind of field data, including the characters "2" and "3." The figure shows that distinguishing these characters becomes more difficult when printed and handwritten characters are intermingled.

Figure 2. Example of conventional carpet correction

On the other hand, Figure 3 is a sample of the new carpet correction method, which uses "font type" discrimination (discrimination between handwritten and machine printed characters) as a secondary sort key in addition to character code sorting. Although Figures 2 and 3 contain the same data, Figure 3 is more legible for operators because the images are also sorted by font type, which keeps handwritten characters apart from machine printed ones.

Figure 3. Example of e...
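The two-key ordering described for Figure 3 can be illustrated with a minimal sketch. It assumes that each recognized character already carries a font type label produced by some handwritten/printed classifier. The record type `Segment`, its field names, and the choice to list printed characters before handwritten ones are hypothetical and are not taken from the disclosure or the patent.

```python
from dataclasses import dataclass


# Hypothetical record for one recognized character image. The field
# names are illustrative assumptions, not part of the disclosure.
@dataclass(frozen=True)
class Segment:
    image_id: str   # reference to the cropped character image
    char_code: str  # character code assigned by the OCR engine
    font_type: str  # "printed" or "handwritten" (assumed classifier output)


# Assumed display order of font types within one character code.
FONT_ORDER = {"printed": 0, "handwritten": 1}


def carpet_order(segments):
    """Sort by character code (primary key), then font type (secondary key)."""
    return sorted(segments, key=lambda s: (s.char_code, FONT_ORDER[s.font_type]))


segments = [
    Segment("a", "3", "handwritten"),
    Segment("b", "2", "printed"),
    Segment("c", "2", "handwritten"),
    Segment("d", "3", "printed"),
    Segment("e", "2", "printed"),
]

for s in carpet_order(segments):
    print(s.char_code, s.font_type, s.image_id)
```

Because Python's `sorted` is stable, segments with the same character code and font type keep their original relative order, so each block of the display contains images of a single code and a single font type.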