Character Recognition Employing Fourier Transformation Based on Run Length Coded Data Format

IP.com Disclosure Number: IPCOM000079214D
Original Publication Date: 1973-May-01
Included in the Prior Art Database: 2005-Feb-26
Document File: 4 page(s) / 77K

Publishing Venue

IBM

Related People

Min, PJ: AUTHOR [+2]

Abstract

The algorithm to be described provides a means for classifying a hand-printed character, which has been extracted from the facsimile image file of a scanned document. The basic implementation of the algorithm involves a series of horizontal and vertical scans, which are made on character data and used as input signals to a Discrete Fourier Transformation. The input to this recognition scheme should be an array which contains the image description of a character in run-length coded form - the array does not contain the binary image of the character. A binary image representation would map each spot in the character image to an element of the array. The element would be "on" (1), or "off" (0), depending on the status of the corresponding spot in the image.

This is the abbreviated version, containing approximately 54% of the total text.

The algorithm to be described provides a means for classifying a hand-printed character, which has been extracted from the facsimile image file of a scanned document. The basic implementation of the algorithm involves a series of horizontal and vertical scans, which are made on character data and used as input signals to a Discrete Fourier Transformation. The input to this recognition scheme should be an array which contains the image description of a character in run-length coded form; the array does not contain the binary image of the character. A binary image representation would map each spot in the character image to an element of the array. The element would be "on" (1), or "off" (0), depending on the status of the corresponding spot in the image. Instead, each row of the array corresponds to a scan line of the image and contains only the coordinates of transition points for that scan line. A transition point is defined to be a spot where the image's status changes from "on" to "off" or vice versa.
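The difference between the two representations can be illustrated with a minimal sketch. It assumes 0-based column coordinates and that every scan line starts in the "off" state; neither convention is stated in the disclosure, and the function names encode_row and decode_row are hypothetical.

```python
def encode_row(bits):
    """Return the columns at which the pixel status changes along one scan line.

    Assumption: the scan line starts "off", so the first transition
    marks the beginning of the first "on" run.
    """
    transitions = []
    previous = 0
    for column, bit in enumerate(bits):
        if bit != previous:
            transitions.append(column)
            previous = bit
    return transitions


def decode_row(transitions, width):
    """Rebuild the binary scan line of the given width from its transition points."""
    bits = []
    state = 0
    pending = iter(transitions)
    next_transition = next(pending, None)
    for column in range(width):
        if column == next_transition:
            state ^= 1  # status flips at every transition point
            next_transition = next(pending, None)
        bits.append(state)
    return bits


def encode_image(rows):
    """Encode a binary image (list of scan lines) row by row."""
    return [encode_row(row) for row in rows]
```

For example, the scan line 0 0 1 1 1 0 0 1 is stored as the three transition points 2, 5 and 7 rather than as eight binary elements.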

To facilitate character isolation, a size constraint has been placed on the symbols. Thus, all symbol data can be contained within a predefined window, whose size corresponds to the maximum possible dimensions for a character in the symbol set. It is also assumed that each character has been centered within this window. A typical centered array containing the run-length coded form of the symbol in Fig. 1 is illustrated in Fig. 2. Each row of the array represents a scan line; each column of the array contains transition points. Blanks in the array correspond to "0"s in core.

The next phase of the symbol recognition process is feature extraction. A combination of vertical and horizontal scans is simulated on the character image by running a series of checks on the data in the symbol array. The exact positioning of the vertical and horizontal scans is determined by the size of the character. After the symbol array is passed to the recognition subroutine, measurements are made to determine the vertical and horizontal dimensions of the character. The horizontal and vertical intervals at which features are taken are determined as follows:

I(h) = Character Length over Number of Horizontal Scans

I(v) = Character Width over Number of Vert...
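One way to simulate such scans directly on the run-length coded array is sketched below. Several points are assumptions rather than details from the disclosure: coordinates are 0-based, each scan line starts "off", each scan is placed at the middle of its interval, and the vertical interval is computed with the same generic division as I(h), since the I(v) formula is cut off in this abbreviated text. All function names are hypothetical.

```python
from bisect import bisect_right


def scan_interval(extent, scan_count):
    """Interval between scans: the character's extent divided by the number of scans."""
    return extent / scan_count


def scan_positions(start, extent, scan_count):
    """Coordinates of the scans, one per interval.

    Assumption: each scan sits at the middle of its interval.
    """
    step = scan_interval(extent, scan_count)
    return [start + int(step * (k + 0.5)) for k in range(scan_count)]


def pixel_is_on(transitions, column):
    """A spot is "on" when an odd number of transitions lie at or before it."""
    return bisect_right(transitions, column) % 2 == 1


def horizontal_scan(symbol, row_index, width):
    """Sample one scan line (row) of the symbol array across the window width."""
    row = symbol[row_index]
    return [1 if pixel_is_on(row, column) else 0 for column in range(width)]


def vertical_scan(symbol, column):
    """Sample one column across every scan line of the symbol array."""
    return [1 if pixel_is_on(row, column) else 0 for row in symbol]
```

With a character length of 12 scan lines and 4 horizontal scans, scan_interval gives an interval of 3.0 and scan_positions(0, 12, 4) places the scans on rows 1, 4, 7 and 10. The binary sequences returned by the two scan functions are the kind of signal that could then be passed to a Discrete Fourier Transformation.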