
Similarity Measure of Hidden Markov Models

IP.com Disclosure Number: IPCOM000122649D
Original Publication Date: 1991-Dec-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 4 page(s) / 150K

Publishing Venue

IBM

Related People

Emam, OS: AUTHOR

Abstract

A new metric is disclosed for speech recognition tasks; it is used to merge similar triphone contexts when deriving a basic phonetic unit. The computation is performed directly from the model parameters, without reference to training or test data.


      A problem associated with Hidden Markov Models (HMMs) is measuring
similarity: given two models M1 and M2, what is the measure of similarity
between them?  Many criteria could be used to determine the similarity of
two HMMs.  Several similarity measures have been proposed in the
literature, including output string/symbol probability and maximum mutual
information.  This article first describes some known distance metrics and
then introduces a new similarity measure between two HMMs used in speech
recognition work.

      The First Measure (1) is based on the idea of the average
probability, with respect to M2, of an observation O coming from M1.  If O
is the phenomenon modelled by M1, then P(O | M2) indicates how probable it
is that M2 produces the same observation.  Summing over all possible
observations O of the phenomenon modelled by M1, the measure obtained is:
      D(M_1, M_2) = \sum_{O} P(O \mid M_1) \, P(O \mid M_2)
This first measure computes the similarity between the two models over all
possible observations, but it could also be based on only a subset of them.
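
      As an illustration of how the first measure can be computed (this
sketch is not part of the original article; the model parameters, the
sequence length, and the helper name forward_prob are hypothetical), the
following Python code enumerates all observation sequences of a fixed
length T for two small discrete-output HMMs, so it evaluates the measure
on a subset of the possible observations, as noted above.

import itertools
import numpy as np

def forward_prob(obs, pi, A, B):
    # P(obs | model) via the forward algorithm.
    # pi: (N,)   initial state probabilities
    # A:  (N, N) transition probabilities, A[i, j] = P(state j | state i)
    # B:  (N, K) output probabilities,     B[i, k] = P(symbol k | state i)
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
    return alpha.sum()

def first_measure(m1, m2, num_symbols, T):
    # Sum of P(O | M1) * P(O | M2) over all length-T observation sequences.
    total = 0.0
    for obs in itertools.product(range(num_symbols), repeat=T):
        total += forward_prob(obs, *m1) * forward_prob(obs, *m2)
    return total

# Two hypothetical 2-state, 3-symbol models (parameters chosen only for
# illustration).
m1 = (np.array([0.6, 0.4]),
      np.array([[0.7, 0.3], [0.4, 0.6]]),
      np.array([[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]]))
m2 = (np.array([0.5, 0.5]),
      np.array([[0.6, 0.4], [0.5, 0.5]]),
      np.array([[0.4, 0.4, 0.2], [0.2, 0.3, 0.5]]))

print(first_measure(m1, m2, num_symbols=3, T=4))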

      The Second Measure (2) expresses how well model M1 matches
observations generated by model M2, relative to how well model M2 matches
observations generated by itself.  Thus, given n observation sequences
O^{(2)}_i, i = 1, 2, ..., n, generated by M2, the distance measure is given
by:
      D(M_1, M_2) = \frac{1}{n} \sum_{i=1}^{n} \left| \log P(O^{(2)}_i \mid M_1) - \log P(O^{(2)}_i \mid M_2) \right|
To make the distance measure symmetric, it should be computed as
D(M1,M2)+D(M2,M1).
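
      A minimal sketch of the second measure follows (again, not from the
original article); it assumes the forward_prob helper and the hypothetical
models m1 and m2 defined in the previous sketch, draws n observation
sequences from M2 by Monte Carlo sampling, and averages the absolute
log-likelihood differences.  The symmetric version D(M1,M2)+D(M2,M1) is
computed at the end, as suggested above.

rng = np.random.default_rng(0)

def sample_sequence(model, T, rng):
    # Generate one length-T observation sequence from a discrete-output HMM.
    pi, A, B = model
    state = rng.choice(len(pi), p=pi)
    obs = []
    for _ in range(T):
        obs.append(rng.choice(B.shape[1], p=B[state]))
        state = rng.choice(len(pi), p=A[state])
    return obs

def second_measure(m1, m2, n=1000, T=20):
    # (1/n) * sum_i | log P(O_i | M1) - log P(O_i | M2) |, with the O_i
    # drawn from M2.
    diffs = []
    for _ in range(n):
        obs = sample_sequence(m2, T, rng)
        diffs.append(abs(np.log(forward_prob(obs, *m1)) -
                         np.log(forward_prob(obs, *m2))))
    return np.mean(diffs)

d_sym = second_measure(m1, m2) + second_measure(m2, m1)
print(d_sym)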

      The Third Measure (3) is an information-theoretic measure that
determines the similarity between two HMMs from the amount of information
lost when the two models are merged, as in the work reported in (3).  Lee
used the weighted entropy of the original and merged HMMs to measure the
information lost.  The weights are calculated using the forward-backward
counts obtained during training of the models.

      Ignoring the transition probabilities, the entropy of an HMM is
defined as the number of bits of information in its output probability
distributions.  Let:

      N_{1,d}(i) be the count for codeword i in distribution d of model M1,
as determined by the forward-backward algorithm, and

      N_{1,d} = \sum_{i} N_{1,d}(i)
To normalize these counts into output probabilities:
      P_{1,d}(i) = \frac{N_{1,d}(i)}{N_{1,d}}
The entropy of an output probability distribution d for a model M1 is
defined as:
      H_{1,d} = - \sum_{i} P_{1,d}(i) \log P_{1,d}(i)
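
      As a small illustration of these last steps (not from the original
article; the counts are hypothetical), the following sketch normalizes the
forward-backward counts of one output distribution d of model M1 into
probabilities and computes its entropy in bits.

import numpy as np

# Hypothetical forward-backward counts N_{1,d}(i) for codewords i = 0..3.
counts = np.array([120.0, 45.0, 30.0, 5.0])

probs = counts / counts.sum()              # P_{1,d}(i) = N_{1,d}(i) / N_{1,d}
entropy = -np.sum(probs * np.log2(probs))  # H_{1,d} in bits
print(entropy)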