Constructing Gaussian Seed Clusters to Improve Labelling Accuracy in a Diagonal Gaussian Vector Quantiser for Speech Recognition

IP.com Disclosure Number: IPCOM000106663D
Original Publication Date: 1993-Dec-01
Included in the Prior Art Database: 2005-Mar-21
Document File: 2 page(s) / 62K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+7]

Abstract

In [1] a procedure is given for constructing a vector quantiser for speech recognition purposes. The vector quantiser is based on a mixture of diagonal Gaussian distributions which is created in two stages. First, seed distributions are derived using a Euclidean clustering process. Second, the seed distributions are refined via K-means diagonal Gaussian clustering. In [2-4] various methods of speeding up seed construction are given, but the design principles are essentially the same: find a mixture of diagonal Gaussians which maximises the likelihood of some training data. The accuracy of the resulting labeller does not figure directly in the design of the Gaussian mixture.

This is the abbreviated version, containing approximately 52% of the total text.

      The purpose of the present invention is to create seed
Gaussians which are intended to improve the ability of the labeller
to discriminate between labels, even if this reduces the likelihood
of the training data.

      Assume the existence of some acoustic parameter vectors, each
tagged with the identity of the label that the vector quantiser is
intended to output for it; this is called training data.  The
following steps are performed:

1.  Perform Steps 2-3 for each acoustic vector X in turn.

2.  Find the nearest neighbour of X in the training data.  Nearest
    means having minimum Euclidean distance.  Neighbour means that it
    cannot be X itself.

3.  If the tags of X and its nearest neighbour are different, then X
    is assumed to lie on or close to a label boundary, in which case
    tag X as being a boundary vector.

4.  Select a sample of N vectors from the boundary vectors to serve
    as seeds for Euclidean clustering.  A reasonable value for N is
    about 10-20 t...
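
The boundary-vector steps listed above can be sketched in Python as follows. This is a minimal illustration, not the disclosure's implementation. The function names `find_boundary_vectors` and `sample_seeds` are hypothetical, the brute-force nearest-neighbour search and the uniform random sampling are assumptions, and the seed count is left as a parameter `n` because the recommended value is cut off in this abbreviated text. Ties in distance are broken by taking the first candidate found, which is also an assumption.

```python
import math
import random


def find_boundary_vectors(vectors, tags):
    """Return indices of vectors whose nearest neighbour has a different tag.

    Brute-force search (an assumption; the source does not say how the
    nearest neighbour is found).
    """
    boundary = []
    for i, x in enumerate(vectors):
        best_j, best_d = None, math.inf
        for j, y in enumerate(vectors):
            if j == i:  # a neighbour cannot be X itself
                continue
            d = math.dist(x, y)  # Euclidean distance
            if d < best_d:  # strict '<' keeps the first candidate on ties
                best_j, best_d = j, d
        # Differing tags: X is assumed to lie on or near a label boundary.
        if best_j is not None and tags[best_j] != tags[i]:
            boundary.append(i)
    return boundary


def sample_seeds(vectors, boundary, n, rng=None):
    """Pick up to n boundary vectors to serve as Euclidean clustering seeds.

    Uniform sampling without replacement is an assumption; the source
    only says to "select a sample".
    """
    rng = rng or random.Random(0)
    n = min(n, len(boundary))
    return [vectors[i] for i in rng.sample(boundary, n)]


# Toy training data: two labels laid out along one axis.
vectors = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0),
           (2.5, 0.0), (3.5, 0.0), (4.5, 0.0)]
tags = ["A", "A", "A", "B", "B", "B"]

boundary = find_boundary_vectors(vectors, tags)
seeds = sample_seeds(vectors, boundary, n=2)
```

In this toy data only the two vectors that face each other across the A/B boundary are tagged, so both become seeds. For realistic training sets the quadratic search would likely need to be replaced by a faster nearest-neighbour method.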