Analysis And Splitting of Potentially Connected Characters

IP.com Disclosure Number: IPCOM000102252D
Original Publication Date: 1990-Nov-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 6 page(s) / 243K

Publishing Venue

IBM

Related People

Narasimha, MS: AUTHOR [+2]

Abstract

A string of numeric handprint characters will often contain characters which have been connected or otherwise made to overlap by the writer. Some special instances of this situation (e.g., double zeros) may be successfully recognized in their connected state. In general, however, it is necessary to detect the point of inadvertent connection and split the characters apart before attempting recognition.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 27% of the total text.

       A string of numeric handprint characters will often
contain characters which have been connected or otherwise made to
overlap by the writer.  Some special instances of this situation
(e.g., double zeros) may be successfully recognized in their
connected state.  In general, however, it is necessary to detect the
point of inadvertent connection and split the characters apart before
attempting recognition.

      Numeric handprint characters may also be fragmented (e.g., a
detached top of a five) or contain dropouts caused by pen skips or
the scanning and thresholding process. Thus, segmentation routines
must be able to logically combine multiple fragments into a component
consisting of a single character.  During this process, physically
disconnected characters may be inadvertently logically combined into
a single component.  After creating each logically connected
component, the segmentation routines check whether the component
looks "suspiciously" like it may in fact contain multiple
characters.
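
      The disclosure does not say how a component is judged
"suspicious".  As a purely illustrative sketch, one simple test is to
flag components whose bounding box is unusually wide for their
height.  The function names, the (row, col, length) tuple layout, and
the max_aspect threshold below are assumptions made for this example,
not details taken from the original routines.

```python
from typing import Iterable, List, Tuple

# Hypothetical layout: one horizontal run of black pels per tuple.
Segment = Tuple[int, int, int]  # (row, first column, length)


def bounding_box(segments: List[Segment]) -> Tuple[int, int, int, int]:
    """Return (top, left, bottom, right), all inclusive."""
    top = min(row for row, _, _ in segments)
    bottom = max(row for row, _, _ in segments)
    left = min(col for _, col, _ in segments)
    right = max(col + length - 1 for _, col, length in segments)
    return top, left, bottom, right


def looks_suspicious(segments: Iterable[Segment], max_aspect: float = 1.2) -> bool:
    """Flag a component whose width/height ratio exceeds max_aspect.

    The threshold is an assumed value, chosen only for illustration.
    """
    segs = list(segments)
    top, left, bottom, right = bounding_box(segs)
    width = right - left + 1
    height = bottom - top + 1
    return width / height > max_aspect


narrow = [(r, 0, 2) for r in range(4)]    # 2 pels wide, 4 rows tall
wide = [(r, 0, 10) for r in range(4)]     # 10 pels wide, 4 rows tall
print(looks_suspicious(narrow), looks_suspicious(wide))
```

      A real system would likely combine several cues; the aspect
ratio is used here only because it is easy to state.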

      If a component does look suspicious, the analysis and splitting
routines are invoked.  These routines must therefore decide among the
following three options:
1.   The logically connected component does contain overlapping or
connected characters.  In this case the point(s) of contact must be
detected and the characters split apart.
2.   The logically connected component contains multiple characters,
but they are not physically connected.  In this case the segments
belonging to each character must be identified.
3.   The logically connected component consists of a single character
only.  In this case no action should be taken.

      Thus, analysis and splitting must first determine which of the
three situations is true.  In the first case a cut path for
partitioning a connected component must then be found.  In the second
case a path which correctly partitions the existing components must
be found.  Solving these problems in a way which is both fast and
reliable is very challenging.
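
      The three outcomes can be sketched as a small dispatch routine.
This is a simplified illustration, not the method of the disclosure:
it assumes the decision (the Verdict value) has already been made, and
it replaces the cut path with a single straight vertical cut at a
hypothetical column cut_col.  All names here are invented for the
example.

```python
from enum import Enum
from typing import List, Optional, Tuple

Segment = Tuple[int, int, int]  # (row, first column, length); assumed layout


class Verdict(Enum):
    CONNECTED = 1  # option 1: characters physically touch or overlap
    SEPARATE = 2   # option 2: several characters, not physically connected
    SINGLE = 3     # option 3: one character only


def apply_verdict(segments: List[Segment], verdict: Verdict,
                  cut_col: Optional[int] = None) -> List[List[Segment]]:
    """Partition a component according to the verdict.

    cut_col is the first column belonging to the right-hand part.
    """
    if verdict is Verdict.SINGLE:
        return [list(segments)]  # no action taken
    if cut_col is None:
        raise ValueError("a cut column is required for options 1 and 2")

    left: List[Segment] = []
    right: List[Segment] = []
    for row, col, length in segments:
        end = col + length  # exclusive end column
        if end <= cut_col:
            left.append((row, col, length))
        elif col >= cut_col:
            right.append((row, col, length))
        elif verdict is Verdict.CONNECTED:
            # The segment crosses the cut: split it at the point of contact.
            left.append((row, col, cut_col - col))
            right.append((row, cut_col, end - cut_col))
        else:
            # SEPARATE: keep the segment whole and assign it by its midpoint.
            midpoint = col + length / 2
            (left if midpoint < cut_col else right).append((row, col, length))
    return [left, right]


touching = [(0, 0, 3), (0, 5, 3), (1, 0, 8)]
print(apply_verdict(touching, Verdict.CONNECTED, cut_col=4))
```

      In the connected case only the segment that crosses the cut is
divided; in the disconnected case every segment is kept intact and
merely assigned to one side.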

      This invention disclosure describes a technique based upon the
use of vertical edges which has proven to be both fast and reliable.
The process works as follows:

      Step 1:  Build the extended black segment descriptor table.
Each component passed to analysis and splitting is described by an
array of black segment descriptors which identify the row, column,
and length of each horizontal extent of black (i.e., image as opposed
to background) pels. The extended black segment descriptor table
contains additional fields which hold the end column and the edge
identification numbers of the image edges to which the left and right
ends of the segment respectively belong.  These fields are used in
the construction of the vertical edge table.
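
      A minimal sketch of Step 1 follows, assuming the component is
available as a binary raster (1 for black, 0 for background).  The
class and field names are hypothetical, and the end column is assumed
to be inclusive; the disclosure does not specify either.  The edge
identification fields are left unset here, since they relate to the
vertical edge table built in the next step.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class BlackSegment:
    row: int
    col: int     # column of the leftmost black pel
    length: int  # number of consecutive black pels


@dataclass
class ExtendedBlackSegment:
    row: int
    col: int
    length: int
    end_col: int                         # assumed inclusive: col + length - 1
    left_edge_id: Optional[int] = None   # edge owning the left end (unset here)
    right_edge_id: Optional[int] = None  # edge owning the right end (unset here)


def find_black_segments(image: Sequence[Sequence[int]]) -> List[BlackSegment]:
    """Scan each row and record every horizontal run of black pels."""
    segments: List[BlackSegment] = []
    for r, row in enumerate(image):
        start: Optional[int] = None
        for c, pel in enumerate(row):
            if pel and start is None:
                start = c  # a run begins
            elif not pel and start is not None:
                segments.append(BlackSegment(r, start, c - start))
                start = None
        if start is not None:  # run reaches the right border
            segments.append(BlackSegment(r, start, len(row) - start))
    return segments


def extend_segments(segments: List[BlackSegment]) -> List[ExtendedBlackSegment]:
    """Add the end column; edge identification numbers stay unassigned."""
    return [
        ExtendedBlackSegment(s.row, s.col, s.length, s.col + s.length - 1)
        for s in segments
    ]


image = [
    [0, 1, 1, 0, 1],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0],
]
table = extend_segments(find_black_segments(image))
print([(s.row, s.col, s.length, s.end_col) for s in table])
```

      Storing the end column explicitly trades a little memory for
not recomputing col + length - 1 each time a segment's right end is
examined.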

      Step 2:   Build the vertical edge table. The vertical edge
table is built in a singl...