Method of Selecting Prototypes of Mixed Dimensionality from Labelled Data

IP.com Disclosure Number: IPCOM000104610D
Original Publication Date: 1993-May-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 4 page(s) / 131K

Publishing Venue

IBM

Related People

Das, SK: AUTHOR

Abstract

Substantial quantities of labelled data are often available in many applications. This article describes how a set of prototypes of mixed dimensionality is derived from such data for effective classification. The prototypes are of mixed dimensionality because they can span from one to several frames or include supplementary parameters to enhance their classification power.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 42% of the total text.

      Substantial quantities of labelled data are often available in
many applications.  In the case of speech recognition, for example,
such data are typically obtained by the following procedure.  The
speech data are represented every centisecond by a vector of
parameters.  Next, a vector quantizing procedure is applied to derive
a set of unsupervised prototypes.  Speech data classified by these
prototypes are used for hidden Markov model training and Viterbi
alignment.  Such alignment provides a label for each centisecond
frame of speech data.

      The current invention provides a method of prototype selection
from such labelled data for effective classification.  It may be
noted that relatively straightforward techniques, such as averaging
the tokens of different classes, picking the first several
occurrences of each class or selecting class tokens uniformly
throughout the available data, usually lead to unsatisfactory
classifier performance.
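
      For reference, the first of these straightforward techniques,
averaging the tokens of each class, might look like the sketch below.
It is shown only to make the baseline concrete; the function name
class_mean_prototypes and the labels "AA" and "S" are hypothetical
and do not come from the original text.

```python
from collections import defaultdict


def class_mean_prototypes(vectors, labels):
    """Return one prototype per label: the mean of that label's vectors.

    This is the simple averaging baseline, not the selection method
    the article goes on to describe.
    """
    sums = {}
    counts = defaultdict(int)
    for vec, lab in zip(vectors, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        for k, x in enumerate(vec):
            sums[lab][k] += x
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}


# Three labelled frames with P = 2 parameters each (made-up values).
frames = [[1.0, 2.0], [3.0, 4.0], [10.0, 10.0]]
labels = ["AA", "AA", "S"]
prototypes = class_mean_prototypes(frames, labels)
```

      A single averaged prototype per class cannot represent a class
whose tokens fall into several distinct clusters, which is one
plausible reason such a baseline performs poorly.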

      Another point to remember is that in applications such as
speech sound classification, it is often desirable to examine a span
of several frames at a time since a single frame may not contain
sufficient discriminatory information.  It may also be helpful to
include additional parameters such as formant trajectory information.
This gives rise to the notion of data of mixed dimensionality.  For
example, if a single frame is represented by a vector of P
parameters, a span of S such frames spliced together constitutes a
vector of dimensionality SP.
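
      The splicing step can be sketched as follows.  The text only
states that S frames are spliced together; centring the span on a
given frame and skipping spans that run past the ends of the data
are assumptions made for this illustration, and splice_frames is a
hypothetical name.

```python
def splice_frames(frames, center, span):
    """Concatenate `span` consecutive frames around index `center`.

    Each frame is a list of P parameters, so the result has
    span * P components.  Returns None when the span would extend
    beyond the available frames (an assumption of this sketch).
    """
    start = center - span // 2
    stop = start + span
    if start < 0 or stop > len(frames):
        return None
    spliced = []
    for frame in frames[start:stop]:
        spliced.extend(frame)
    return spliced


# Four frames with P = 3 parameters each (made-up values).
frames = [
    [0.1, 0.2, 0.3],
    [1.1, 1.2, 1.3],
    [2.1, 2.2, 2.3],
    [3.1, 3.2, 3.3],
]
vector = splice_frames(frames, center=1, span=3)  # S = 3, so SP = 9
```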

      This invention is designed to explore a data span of one to
several frames and include optional parameters for improved
performance.  The final result is a set of prototypes of mixed
dimensionality which are simultaneously used for classification after
appropriate normalization.
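
      One way prototypes of different dimensionality could compete in
a single nearest-prototype classifier is sketched below.  The text
does not say which normalization is used, so dividing the squared
Euclidean distance by the number of components compared is purely an
assumption, as is the convention that a shorter prototype is matched
against the leading components of the token.  All names are
hypothetical.

```python
def normalized_distance(token, prototype):
    """Mean squared difference over the components the prototype covers.

    Dividing by the prototype's dimensionality is an assumed
    normalization; the source does not specify one.
    """
    dim = len(prototype)
    return sum((t - p) ** 2 for t, p in zip(token[:dim], prototype)) / dim


def classify(token, prototypes):
    """Return the label of the prototype with the smallest normalized distance.

    `prototypes` is a list of (label, vector) pairs whose vectors may
    have different lengths, none longer than the token.
    """
    best_label, _ = min(
        prototypes, key=lambda item: normalized_distance(token, item[1])
    )
    return best_label


# Two prototypes of different dimensionality (made-up values).
prototypes = [
    ("AA", [1.0, 1.0]),
    ("S", [5.0, 5.0, 5.0, 5.0]),
]
token = [1.2, 0.9, 4.0, 4.0]
label = classify(token, prototypes)
```

      Without some normalization of this kind, a prototype with fewer
components would tend to accumulate a smaller raw distance simply
because it sums fewer terms.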

      At the outset, the original labelled database is divided into
two independent parts called the test database and the training
database.  The goal is to select prototypes from the training
database that maximize the correct classification of tokens in the
test database.  The data are represented in the space of maximum
dimensionality intended for the implementation.
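
      A minimal sketch of the division into two parts is given below.
The text does not say how the split is made; taking the first portion
of the data as the test database and the remainder as the training
database, and the default fraction of 0.5, are assumptions of this
example.  The name split_database is hypothetical.

```python
def split_database(vectors, labels, test_fraction=0.5):
    """Split labelled data into (test, training) parts without overlap.

    Each part is a (vectors, labels) pair.  The contiguous split and
    the default fraction are assumptions, not taken from the source.
    """
    if len(vectors) != len(labels):
        raise ValueError("every vector needs exactly one label")
    n_test = int(len(vectors) * test_fraction)
    test = (vectors[:n_test], labels[:n_test])
    training = (vectors[n_test:], labels[n_test:])
    return test, training


# Four labelled vectors (made-up values and labels).
vectors = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]
labels = ["AA", "S", "AA", "S"]
test_db, training_db = split_database(vectors, labels)
```

      Prototypes would then be drawn from training_db and scored by
how well they classify the tokens in test_db.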

      Let the test database consist of the vectors T sub i, i = 1, 2,
..., N, and their associated labels r sub i, i = 1, 2, ..., N, where
N is the number of vectors in that database.  Corresponding to these
vectors, three other buffers of size N, whose functions are explained
later, are specified: d sub i, i = 1, 2, ..., N, for running...