
Discrete Parameter System for Automatic Handwriting Recognition

IP.com Disclosure Number: IPCOM000111214D
Original Publication Date: 1994-Feb-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 110K

Publishing Venue

IBM

Related People

Bellegrada, EJ: AUTHOR [+4]

Abstract

A discrete parametrization is developed for a hidden Markov model approach to the automatic recognition of on-line handwriting. Comparison with a previously developed continuous parameter system shows that the decoding speed is improved by up to a factor of five, which opens the door to a real-time implementation.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Discrete Parameter System for Automatic Handwriting Recognition

      A discrete parametrization is developed for a hidden Markov
model approach to the automatic recognition of on-line handwriting.
Comparison with a previously developed continuous parameter system
shows that the decoding speed is improved by up to a factor of five,
which opens the door to a real-time implementation.

      An automatic handwriting recognition system is considered such
as described in prior art, where each representative realization
(referred to as allograph, or lexeme) of each character is
represented by a hidden Markov model.  Each of these Markov character
models, or baseforms, is constructed as a sequence of elementary
units.  In prior art, these units are two-node, three-arc machines
which are automatically derived given some chirographic distributions
in an alphabet A_cp resulting from a suitably large inventory of
chirographic prototypes.  Two ways of obtaining these prototypes are
(i) through unsupervised K-means clustering, or (ii) through
supervised bottom-up clustering.
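
      As an illustration of option (i), the following is a minimal
sketch of unsupervised K-means clustering over feature vectors.  It is
not the procedure used in the prior art: the function name, the toy
two-dimensional frames, and the deterministic initialization are all
assumptions made for this example.

```python
def kmeans(points, k, iters=20):
    """Plain Lloyd's-algorithm K-means; returns (centroids, assignments).

    Assumption of this sketch: centroids are initialised from evenly
    spaced data points so that the result is reproducible.
    """
    centroids = [points[i * len(points) // k] for i in range(k)]
    assign = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assign = [
            min(
                range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])),
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, a in zip(points, assign) if a == j]
            if members:  # keep the old centroid if a cluster went empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assign


# Hypothetical 2-D frames forming two well-separated groups.
frames = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
prototypes, labels = kmeans(frames, k=2)
```

      Each resulting centroid plays the role of one chirographic
prototype; a real system would use many more prototypes and
higher-dimensional feature vectors.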

      The supervised approach of (ii) has the advantage of producing
a single label alphabet from which to draw all potential elementary
units for all writers.  This makes it possible, by exposing the
explicit relationship between a character model and its manifestation
in chirographic space, to compare across writers the baseforms
generated for a given character.  Another consequence, which was not
exploited in (ii), is that it defines a canonical partition of the
chirographic space into high-level regions corresponding to each of
the labels.  This article takes advantage of this high-level
quantization to derive a discrete parametrization for the hidden
Markov models.

      In a continuous parameter system, the baseform associated with
each allograph is represented by a sequence of states and a set of
transition probabilities and output distributions.  In the mixture
model defined in \hmm, the mixture coefficients are the transition
probabilities and the continuous (Gaussian) probability density
functions specify the output probability distributions.
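
      The continuous case can be sketched as follows, assuming
diagonal-covariance Gaussians and a weighted sum over components.  The
function names and the numerical values are hypothetical and serve
only to show the per-frame cost of evaluating the densities.

```python
import math


def gaussian_pdf(x, mean, var):
    """Diagonal-covariance Gaussian density evaluated at feature vector x."""
    log_p = 0.0
    for xi, mi, vi in zip(x, mean, var):
        log_p += -0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
    return math.exp(log_p)


def mixture_likelihood(x, weights, means, variances):
    """Weighted sum of Gaussian densities (weights act as mixture coefficients)."""
    return sum(
        w * gaussian_pdf(x, m, v) for w, m, v in zip(weights, means, variances)
    )


# Hypothetical two-component mixture over 2-D frames.
frame = (0.0, 0.0)
weights = [0.3, 0.7]
means = [(0.0, 0.0), (5.0, 5.0)]
variances = [(1.0, 1.0), (1.0, 1.0)]
likelihood = mixture_likelihood(frame, weights, means, variances)
```

      Every frame requires one density evaluation per component, which
is the work the discrete parametrization sets out to avoid.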

      In a discrete parameter approach, each baseform will be
expressed as a sequence of discrete symbols, called fenones,
resulting from the high-level vector quantization of the chirographic
space discussed above.  The current feature vector (or frame) is
replaced by its vector-quantized label and each continuous
(Gaussian) distribution is replaced by a discrete probability
function.
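
      A minimal sketch of this replacement is given below, assuming
nearest-prototype quantization and a per-fenone probability table.
The prototype coordinates, the fenone names "f0" and "f1", and the
table entries are hypothetical.

```python
def quantize(frame, prototypes):
    """Return the label (index) of the prototype nearest to the frame."""
    return min(
        range(len(prototypes)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(frame, prototypes[j])),
    )


# Hypothetical label alphabet: one prototype per label.
prototypes = [(0.0, 0.0), (5.0, 5.0)]

# Hypothetical discrete output distributions: for each fenone, the
# probability of emitting each label.  Each row sums to one.
output_prob = {
    "f0": [0.9, 0.1],
    "f1": [0.2, 0.8],
}


def discrete_output_prob(fenone, frame):
    """Output probability of a frame under a fenone, via its quantized label."""
    label = quantize(frame, prototypes)
    return output_prob[fenone][label]


p = discrete_output_prob("f1", (4.8, 5.3))
```

      In this sketch a frame is quantized once, after which the output
probability under any fenone is a table lookup rather than a density
evaluation.  This is one plausible reading of where a decoding speedup
could come from, not a statement of the measured cause.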

      Handwriting is captured for some adequate number of writers N.
We assume that all handwriting data has been pooled together and
appropriately signal processed into a (number of) sequence(s)...