Non-Gaussian Distance Measure for Clustering and Labelling Algorithms in Vector-Quantising Speech Recognition Systems

IP.com Disclosure Number: IPCOM000106025D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+2]

Abstract

The algorithms in [1] are known to produce very successful labels for speech recognition. Broadly, the methods of [1] consist of two steps: K-means clustering of training parameter vectors using a diagonal-Gaussian distance measure, and maximum-likelihood labelling of test parameter vectors using a mixture of diagonal-Gaussians.

Non-Gaussian Distance Measure for Clustering and Labelling Algorithms in Vector-Quantising Speech Recognition Systems

      The algorithms in [1]  are known to produce very successful
labels for speech recognition.  Broadly, the methods of [1]  consist
of two steps:  K-means clustering of training parameter vectors using
a diagonal-Gaussian distance measure, and maximum-likelihood
labelling of test parameter vectors using a mixture of
diagonal-Gaussians.
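
      To make these two steps concrete, the fragment below is a minimal
sketch (not part of the original disclosure; written here in Python with
numpy, with illustrative names) of a diagonal-Gaussian log-likelihood and
of maximum-likelihood labelling of a test vector against a mixture of
diagonal Gaussians.

    import numpy as np

    def diag_gaussian_logpdf(x, mean, var):
        # Log-density of a diagonal-covariance Gaussian evaluated at x.
        return -0.5 * np.sum(np.log(2.0 * np.pi * var) + (x - mean) ** 2 / var)

    def ml_label(x, means, variances, weights):
        # Maximum-likelihood label: the index of the mixture component
        # (prototype) whose weighted diagonal-Gaussian likelihood is largest.
        log_likes = [np.log(w) + diag_gaussian_logpdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
        return int(np.argmax(log_likes))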

      Good as these methods are, it can be shown experimentally that
the diagonal-Gaussian mixture model leads to unrealistic relative
label likelihoods; typically, the labeller is over-confident.  The
procedure below improves on the procedures of [1]  by using a more
realistic non-Gaussian distance measure, and a more efficient
clustering method strongly related to the Baum-Welch (or
forward-backward) algorithm [1].

      In [1]  the training parameter vectors are subjected to K-means
clustering using a diagonal-Gaussian distance measure.  That is, each
parameter vector is allocated to the nearest prototype, and the
prototype means and variances are then recomputed from the set of
allocated vectors.
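
      For reference, the following hedged sketch (again Python/numpy with
illustrative names, not the original code) shows one pass of this
hard-assignment step: each vector is allocated entirely to its nearest
prototype under a diagonal-Gaussian distance, and the prototype means and
variances are then recomputed from the allocated vectors.

    import numpy as np

    def diag_gaussian_distance(x, mean, var):
        # Negative diagonal-Gaussian log-likelihood with the constant term
        # dropped; smaller values mean x is "nearer" to this prototype.
        return 0.5 * np.sum(np.log(var) + (x - mean) ** 2 / var)

    def hard_kmeans_step(vectors, means, variances):
        K = len(means)
        # Allocate each training vector entirely to its nearest prototype.
        assignments = np.array([
            np.argmin([diag_gaussian_distance(x, means[k], variances[k])
                       for k in range(K)])
            for x in vectors])
        # Recompute prototype means and variances from the allocated vectors.
        for k in range(K):
            members = vectors[assignments == k]
            if len(members) > 0:
                means[k] = members.mean(axis=0)
                variances[k] = members.var(axis=0) + 1e-6  # variance floor
        return assignments, means, variances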

We can improve on this in the following three ways:

1.  Instead of allocating each vector entirely to its nearest
    prototype, allocate each vector to all prototypes in proportion
    to the corresponding relative likelihoods (see the sketch after
    this list).  This is less biased and leads to more efficient use
    of the training data.

2.  Instead of estimating the relative likelihoods using diagonal
    Gaussian likelihoods, estimate the relative likelihoods from...
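
      To illustrate item 1 above, the sketch below (Python/numpy, with
hypothetical names, not from the original disclosure) replaces the hard
nearest-prototype allocation with a fractional one: each vector is spread
over all prototypes in proportion to its relative likelihoods, and the
prototype weights, means and variances are re-estimated from those
fractional counts, in the spirit of a Baum-Welch re-estimation step.

    import numpy as np

    def soft_allocation_step(vectors, means, variances, weights):
        N, D = vectors.shape
        K = means.shape[0]
        # Per-prototype log-likelihoods of every vector under diagonal Gaussians.
        log_like = np.empty((N, K))
        for k in range(K):
            log_like[:, k] = (np.log(weights[k])
                              - 0.5 * np.sum(np.log(2.0 * np.pi * variances[k])
                                             + (vectors - means[k]) ** 2
                                             / variances[k], axis=1))
        # Relative likelihoods: normalise so each vector's allocations sum to 1.
        log_like -= log_like.max(axis=1, keepdims=True)
        gamma = np.exp(log_like)
        gamma /= gamma.sum(axis=1, keepdims=True)
        # Re-estimate prototype statistics from the fractional counts.
        counts = gamma.sum(axis=0)              # soft occupancy per prototype
        new_weights = counts / N
        new_means = (gamma.T @ vectors) / counts[:, None]
        second_moment = (gamma.T @ (vectors ** 2)) / counts[:, None]
        new_variances = second_moment - new_means ** 2 + 1e-6  # variance floor
        return new_weights, new_means, new_variances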