Segmentation Algorithm for Dot Printed Characters

IP.com Disclosure Number: IPCOM000034729D
Original Publication Date: 1989-Apr-01
Included in the Prior Art Database: 2005-Jan-27
Document File: 2 page(s) / 49K

Publishing Venue

IBM

Related People

Mano, T: AUTHOR [+2]

Abstract

Disclosed is a segmentation algorithm for an optical character recognition system that breaks images of roughly printed characters into separate, distinct images, one per character. A roughly printed character is one printed with separate, discontinuous black dots. Run-end data representing the discontinuous black dots in the character are modified so that the roughly printed character is recognized as a character drawn with continuous black dots. The figure shows examples of roughly printed characters, namely the numerals 7 and 8, printed with discontinuous black dots. The roughly printed numerals 7 and 8 are segmented from each other in the following manner. The images of the numerals 7 and 8 are stored in an image memory (not shown). The images are scanned from top to bottom.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 60% of the total text.

Disclosed is a segmentation algorithm for an optical character recognition system that breaks images of roughly printed characters into separate, distinct images, one per character. A roughly printed character is one printed with separate, discontinuous black dots. Run-end data representing the discontinuous black dots in the character are modified so that the roughly printed character is recognized as a character drawn with continuous black dots.

The figure shows examples of roughly printed characters, namely the numerals 7 and 8, printed with discontinuous black dots. The roughly printed numerals 7 and 8 are segmented from each other in the following manner. The images of the numerals 7 and 8 are stored in an image memory (not shown). The images are scanned from top to bottom.

The distribution of the black dots sampled by a scan line at position 11 is represented by black areas 12-17. The algorithm detects the points in the scan line at which the color changes. The values A1, A2, A3, A4, A5 and A6 are the addresses of the changing points from white to black. The values B1, B2, B3, B4, B5 and B6 are the addresses of the changing points from black to white. The black areas are thus represented by the data (A1, B1), (A2, B2), (A3, B3), (A4, B4), (A5, B5) and (A6, B6), which are called the run-end data. The algorithm creates modified data MD1, MD2, MD3, MD4, MD5 and MD6 by adding a constant value C...
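
The extraction of run-end data from a single scan line, as described above, might be sketched as follows. This is a minimal illustration and not the disclosed implementation: the function name `run_end_data`, the representation of a scan line as a list of 0 (white) and 1 (black) pixels, and the convention that each B address is the index of the first white pixel after a black run are all assumptions made for the example. The later step that builds the modified data from the constant C is not sketched, because the available text breaks off before describing it.

```python
from typing import List, Tuple


def run_end_data(scan_line: List[int]) -> List[Tuple[int, int]]:
    """Return one (A, B) pair per black run in a scan line.

    Assumed conventions (not stated in the disclosure):
      - scan_line holds 0 for white and 1 for black pixels;
      - A is the index of the first black pixel of a run
        (the white-to-black changing point);
      - B is the index of the first white pixel after the run
        (the black-to-white changing point).
    """
    runs: List[Tuple[int, int]] = []
    start = None  # index where the current black run began, if any
    for x, pixel in enumerate(scan_line):
        if pixel and start is None:
            start = x                 # white-to-black change: record A
        elif not pixel and start is not None:
            runs.append((start, x))   # black-to-white change: record B
            start = None
    if start is not None:
        # A black run that reaches the end of the line is closed there.
        runs.append((start, len(scan_line)))
    return runs


if __name__ == "__main__":
    # Three separate dots on one scan line, standing in for black areas.
    line = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0]
    print(run_end_data(line))  # [(1, 3), (5, 6), (7, 10)]
```

Each tuple in the result corresponds to one (A, B) pair of the run-end data, so a scan line crossing six black areas would yield six pairs.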