Writer-Independent Automatic Handwriting Recognition Using a Continuous Parameter Mixture Density Hidden Markov Model

IP.com Disclosure Number: IPCOM000104571D
Original Publication Date: 1993-May-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 87K

Publishing Venue

IBM

Related People

Bellegarda, EJ: AUTHOR [+4]

Abstract

A new strategy for the automatic recognition of on-line handwritten text is described. It is based on a left-to-right hidden Markov model (HMM) that models the dynamics of the written script. A mixture of Gaussian distributions is used to represent the output probabilities at each arc (transition) of the HMM. Different tying alternatives to train the mixture coefficients are investigated. Experimental results are presented in the writer-independent case because it offers more training data by pooling data from several writers.

This is the abbreviated version, containing approximately 52% of the total text.

      A statistical mixture model was developed in which a mixture
of diagonal Gaussian densities was defined to represent each
character in some alphabet of interest.  The model was naturally
extended by considering explicit left-to-right hidden Markov models
(baseforms), which more adequately model the intra-character
variations.  The experiments reported previously were
writer-dependent; i.e., the models were trained on samples from a
given writer and the decoding was performed on independent samples
taken from the same writer.  In what is reported here, the models
are trained on samples pooled from a set of 8 writers.  The
decoding is performed on independent samples taken from a set of 4
different writers.
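
      To make the form of this model concrete, a minimal sketch in
Python follows (the function names, argument shapes, and the
hypothetical "models" table are illustrative assumptions, not the
disclosure's implementation):

    import numpy as np

    def log_diag_gaussian(x, mean, var):
        # log N(x; mean, diag(var)) for one diagonal-covariance component
        return -0.5 * np.sum(np.log(2.0 * np.pi * var)
                             + (x - mean) ** 2 / var)

    def log_mixture_density(x, log_weights, means, variances):
        # log of sum_k w_k N(x; mu_k, diag(var_k)), one character's mixture
        comp = np.array([lw + log_diag_gaussian(x, m, v)
                         for lw, m, v in zip(log_weights, means, variances)])
        return np.logaddexp.reduce(comp)

    # e.g., score a feature vector against every character and keep the best:
    #   best = max(models, key=lambda c: log_mixture_density(x, *models[c]))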

      The different writing styles of each character (lexemes) were
identified and the training data separated accordingly.  These
lexemes may differ in shape, direction of pen movements, or number
of strokes.  Each individual lexeme is represented in terms of an
HMM depicted below.

      The states of the model are labeled s_1, ..., s_L.
Associated with each state is a set of transitions, denoted in the
sketch by t_1, t_2, and t_3, that govern the sequence of states.
The transitions labeled t_1 and t_2 result in the emission of a
feature vector, while that labeled t_3 is a null transition and
results in no output.  The number of states, the state transition
probabilities, and the output probability distributions completely
specify the model.
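
      The forward computation over such a model can be sketched as
follows (a hypothetical Python illustration under the stated
topology: per state, t_1 is an emitting self-loop, t_2 an emitting
advance, and t_3 a null advance; the array names and shapes are
assumptions).  Decoding a character then amounts to evaluating this
quantity for each lexeme HMM and selecting the highest-scoring one.

    import numpy as np

    def forward_log_prob(obs_log_like, log_p1, log_p2, log_p3):
        # obs_log_like[t, i]: log output probability of observation t on an
        # emitting transition leaving state i (e.g., a Gaussian-mixture score).
        # log_p1, log_p2, log_p3: length-L log transition probabilities for
        # the self-loop (t_1), emitting advance (t_2), and null advance (t_3).
        T, L = obs_log_like.shape
        # alpha[t, i]: log prob of having consumed t observations, in state i
        alpha = np.full((T + 1, L + 1), -np.inf)
        alpha[0, 0] = 0.0
        for t in range(T + 1):
            # null transitions consume no observation; sweep left to right
            for i in range(L):
                alpha[t, i + 1] = np.logaddexp(alpha[t, i + 1],
                                               alpha[t, i] + log_p3[i])
            if t == T:
                break
            for i in range(L):
                emit = obs_log_like[t, i]
                # t_1: self-loop, emits observation t
                alpha[t + 1, i] = np.logaddexp(alpha[t + 1, i],
                                               alpha[t, i] + log_p1[i] + emit)
                # t_2: advance to the next state, emits observation t
                alpha[t + 1, i + 1] = np.logaddexp(alpha[t + 1, i + 1],
                                                   alpha[t, i] + log_p2[i] + emit)
        return alpha[T, L]   # log P(observation sequence | lexeme model)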

      In the present implementation, the output probabilities at each
arc of the HMM are determined from a mixture of Gaussian
distributions.  The mi...
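
      Although the remainder of the text is not available in this
abbreviated version, one common tying alternative for the mixture
coefficients can be sketched for illustration (a hypothetical Python
example; the disclosure does not necessarily use this scheme): all
emitting arcs share a single pool of diagonal Gaussian components,
and only the mixture coefficients are trained per arc.

    import numpy as np

    class TiedMixtureArcs:
        # One tying alternative: every emitting arc shares a common pool of
        # K diagonal Gaussians; only the mixture coefficients are arc-specific.
        def __init__(self, means, variances, arc_log_weights):
            self.means = means                      # (K, D) shared means
            self.variances = variances              # (K, D) shared diagonal variances
            self.arc_log_weights = arc_log_weights  # (A, K) per-arc log coefficients

        def component_log_likes(self, x):
            # log-density of frame x under each shared Gaussian, computed once
            return -0.5 * np.sum(np.log(2.0 * np.pi * self.variances)
                                 + (x - self.means) ** 2 / self.variances,
                                 axis=1)

        def arc_log_output(self, x, arc):
            # log output probability of x on the given emitting arc
            return np.logaddexp.reduce(self.arc_log_weights[arc]
                                       + self.component_log_likes(x))

      Sharing the component pool means the K Gaussian densities are
evaluated once per feature vector and reused by every arc, and it
lets the pooled multi-writer training data estimate the shared
components from more samples.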