Speed Improvements in Vector Quantization via Hierarchical Labelling

IP.com Disclosure Number: IPCOM000106748D
Original Publication Date: 1993-Dec-01
Included in the Prior Art Database: 2005-Mar-21
Document File: 2 page(s) / 94K

Publishing Venue

IBM

Related People

Bellegarda, J: AUTHOR [+6]

Abstract

One major component of a speech recognition system is the vector quantizer. The function of a vector quantizer is to take an input signal processing vector and find the closest prototype from a set of predefined prototypes [4]. The distance between a vector and a prototype is usually computed using either a Euclidean or Gaussian metric; the time it takes to determine the closest prototype is thus proportional to the product of the dimension of the input vector and the number of prototypes. For a vector quantizer with many prototypes, the amount of computation can become substantial. It is therefore important to develop a fast algorithm for choosing the closest prototype.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

Speed Improvements in Vector Quantization via Hierarchical Labelling

      The idea behind this invention is to hierarchically cluster a
large number of prototypes (called ordinary prototypes) into a small
number, referred to as the hierarchical prototypes.  During
labelling, the input vector is first compared to each of the
hierarchical prototypes, and the closest is found.  Corresponding to
each hierarchical prototype is a set of ordinary prototypes.  The
input vector is then compared only to this set of ordinary
prototypes, and the closest is chosen.  With a judicious choice of
hierarchical prototypes, the computational savings can be
substantial.
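
      The two-stage search described above can be sketched as
follows.  The array names and the cluster-membership structure are
illustrative, not taken from the disclosure; a Euclidean metric is
assumed for both stages.

```python
import numpy as np

def hierarchical_label(x, hier_protos, ordinary_protos, children):
    """Two-stage nearest-prototype search (hierarchical labelling).

    hier_protos:     (H, d) array of hierarchical prototypes
    ordinary_protos: (N, d) array of ordinary prototypes
    children:        list of length H; children[h] is the index array of
                     the ordinary prototypes clustered under hierarchical
                     prototype h
    """
    # Stage 1: find the closest hierarchical prototype.
    h = np.argmin(((hier_protos - x) ** 2).sum(axis=1))
    # Stage 2: search only the ordinary prototypes under that cluster.
    idx = children[h]
    dists = ((ordinary_protos[idx] - x) ** 2).sum(axis=1)
    return int(idx[np.argmin(dists)])
```

Instead of N distance computations, labelling costs H plus the size
of the chosen cluster, which is the source of the claimed savings.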

      The following is a detailed description of the algorithmic
procedure used to construct the hierarchical prototypes.  The current
IBM speech recognition system employs a vector quantizer with 2000
50-dimensional prototypes and a Gaussian distance measure [1], though
this invention is not limited to vector quantizers designed by this
process.  The vector quantizer is designed from a large number of
vectors, say 200,000.  Each dimension corresponds to projecting a
189-dimensional vector onto a set of discriminating eigenvectors
[3]; the dimensions are ordered by the eigenvalue of the
corresponding eigenvector, from largest to smallest.
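
      For reference, the brute-force labelling whose cost motivates
this invention can be sketched with a diagonal-Gaussian distance.
The exact Gaussian form used in [1] is not given in this excerpt, so
the negative-log-likelihood variant below is an assumption.

```python
import numpy as np

def gaussian_label(x, means, variances):
    """Brute-force labelling with an assumed diagonal-Gaussian distance.

    For each prototype, the distance is the negative log-likelihood of x
    under a diagonal Gaussian, up to an additive constant.  The cost is
    proportional to (number of prototypes) x (dimension), e.g. 2000 x 50.
    """
    # (N, d) elementwise terms; the log-variance term penalizes broad
    # prototypes that would otherwise always look "close".
    d = ((x - means) ** 2) / variances + np.log(variances)
    return int(np.argmin(d.sum(axis=1)))
```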

1.  The 2000 50-dimensional prototypes are converted to 2000
    20-dimensional prototypes by discarding the bottom 30 dimensions.

2.  The 2000 20-dimensional prototypes are "bottom-up" clustered [2]
    into 100 20-dimensional prototypes:

    a.  The Euclidean distances between all pairs of prototypes are
        computed.

    b.  Step 1:  The two closest prototypes are found.

    c.  A new prototype is formed by computing the average of the
        means of the two closest prototypes weighted by their
        respective counts.

    d.  A new count is formed for the new prototype by summing the
        counts of the two prototypes.

    e.  The two prototypes are discarded, but a pointer is kept to an
        array identifying all of the original prototypes that were
        used in computing the new prototype.

    f.  If the number of prototypes remai...
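
      The merge loop in steps (a) through (e) above can be sketched
as follows (the excerpt truncates during step (f), which presumably
loops back until the target count is reached).  The naive O(N^2)
pair search follows step (a) literally, and all names are
illustrative.

```python
import numpy as np

def bottom_up_cluster(protos, counts, target):
    """Merge the closest pair of prototypes until `target` remain.

    protos: (N, d) array of prototype means
    counts: (N,) training counts for each prototype
    Returns merged means, merged counts, and, for each merged prototype,
    the list of original prototype indices it absorbed (the pointer
    array of step (e)).
    """
    items = [(protos[i].astype(float), float(counts[i]), [i])
             for i in range(len(protos))]
    while len(items) > target:
        # (a)/(b): find the closest pair by Euclidean distance.
        best = None
        for i in range(len(items)):
            for j in range(i + 1, len(items)):
                d = np.sum((items[i][0] - items[j][0]) ** 2)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        mi, ci, mem_i = items[i]
        mj, cj, mem_j = items[j]
        # (c): count-weighted average of the two means.
        mean = (ci * mi + cj * mj) / (ci + cj)
        # (d)/(e): sum the counts; keep pointers to the originals.
        merged = (mean, ci + cj, mem_i + mem_j)
        items = [it for k, it in enumerate(items) if k not in (i, j)]
        items.append(merged)
    means = np.array([it[0] for it in items])
    cnts = np.array([it[1] for it in items])
    members = [it[2] for it in items]
    return means, cnts, members
```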