Improved Utilization of Multiple Codebooks in Automatic Handwriting Recognition

IP.com Disclosure Number: IPCOM000108512D
Original Publication Date: 1992-Jun-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 3 page(s) / 124K

Publishing Venue

IBM

Related People

Bellegarda, EJ: AUTHOR [+4]

Abstract

This article studies the importance and relative contributions of selected feature parameters in shaping the recognition rate of an automatic handwriting recognizer. It has been determined that the second order information (curvature) does not contribute as much as the zeroth (position) and first order (angle) information. In addition, multiple codebooks operating on low dimensional spaces achieve better results than a single codebook operating on a high-dimensional space. This results in a significant gain in computation time.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      The problem of automatic recognition of handwritten text
produced in discrete, run-on, cursive, or unconstrained mode is
addressed.  The choice of feature parameters is one of the most
important steps in the design of a handwriting recognizer.  It
requires a trade-off between completeness in characterizing
handwriting and the number of feature parameters one is able to
handle efficiently.

      There have been previous attempts to select a set of feature
parameters in such a way as to characterize handwriting produced in
any type of handwriting mode.  In that approach, a feature vector of
size 6 was created at each equispaced point in the pen trajectory,
encompassing (i) the horizontal and vertical incremental changes;
(ii) the sine and cosine of the angle of the tangent to the pen
trajectory at that point; and (iii) the incremental changes in sine
and cosine.  Subsequently, (2H+1) frames (or feature vectors) were
concatenated (spliced) to form one large vector of dimension 6(2H+1).
The corresponding eigenvalues were computed from the total covariance
matrix, and a rotation/projection was then performed to eliminate
redundancy in the data.
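
      The framing and splicing steps described above can be sketched
roughly as follows in Python.  This is an illustration only: the
function names frame_features and splice are hypothetical, the tangent
angle is approximated by the chord between consecutive points, and
frames near the ends of the trajectory are simply dropped.  None of
these choices is stated in the original text.  The eigenvalue-based
rotation/projection is not shown.

```python
import math


def frame_features(points):
    """Build one 6-D frame per point from an equispaced pen trajectory.

    Each frame is [dx, dy, sin, cos, dsin, dcos].  Assumption: the
    tangent direction is approximated by the chord to the next point.
    """
    base = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        norm = math.hypot(dx, dy) or 1.0  # guard against repeated points
        base.append((dx, dy, dy / norm, dx / norm))  # (dx, dy, sin, cos)

    frames = []
    for prev, cur in zip(base, base[1:]):
        dsin = cur[2] - prev[2]  # incremental change in sine
        dcos = cur[3] - prev[3]  # incremental change in cosine
        frames.append([cur[0], cur[1], cur[2], cur[3], dsin, dcos])
    return frames


def splice(frames, H):
    """Concatenate 2H+1 neighbouring frames around each centre frame.

    Assumption: centres without H neighbours on both sides are skipped.
    """
    spliced = []
    for t in range(H, len(frames) - H):
        window = frames[t - H : t + H + 1]
        spliced.append([value for frame in window for value in frame])
    return spliced
```

      Each spliced vector has 6(2H+1) components, so a value of H=20
would give vectors of dimension 246 before any projection.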

      This approach resulted in a codebook derived from 6-D feature
vectors; we shall loosely call it the 6-D codebook.  The goal is to
separate this 6-D codebook into multiple codebooks in order to test a
hypothesis regarding the usefulness of the curvature in the feature
vector.  The procedure consists of (i) studying a 4-D feature vector
made of position and angle information only; (ii) splitting the
original 6-D feature vector into three 2-D vectors (position, angle,
curvature), each assigned to a different codebook; and (iii) varying
the weights of each of the three codebooks in (ii) to determine the
influence of one codebook over the others.
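
      Steps (ii) and (iii) can be illustrated with the following
Python sketch.  It is not the authors' implementation: the text
available here does not say how the three codebooks are combined, so
the weighted sum of nearest-centroid squared distances used below is
an assumption, as are the names split_frame, quantize, and
weighted_distortion.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split_frame(frame):
    """Split a 6-D frame into the three 2-D sub-vectors of step (ii)."""
    return {
        "position": frame[0:2],   # horizontal and vertical increments
        "angle": frame[2:4],      # sine and cosine of the tangent angle
        "curvature": frame[4:6],  # increments in sine and cosine
    }


def quantize(codebook, vec):
    """Return (index, squared distance) of the nearest codebook entry."""
    index = min(range(len(codebook)), key=lambda k: sq_dist(codebook[k], vec))
    return index, sq_dist(codebook[index], vec)


def weighted_distortion(frame, codebooks, weights):
    """Combine per-codebook distortions with one weight per codebook.

    Assumption: a plain weighted sum; the original combination rule
    is not given in the abbreviated text.
    """
    total = 0.0
    for name, vec in split_frame(frame).items():
        _, dist = quantize(codebooks[name], vec)
        total += weights[name] * dist
    return total
```

      Setting the curvature weight to zero in such a scheme leaves
only position and angle information, which is comparable in spirit to
the 4-D case in step (i).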

      The results of the experiments were obtained by taking a value
of H=20.

      The experiments were run on an 81-character vocabular...