Decision Nets for Context-Dependent Prototypes in a Speech Recognition System

IP.com Disclosure Number: IPCOM000104302D
Original Publication Date: 1993-Apr-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 72K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+4]

Abstract

Disclosed is a method for constructing a decision network to arrive at context-dependent prototypes for acoustic parameter vectors.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Disclosed is a method for constructing a decision network to
arrive at context-dependent prototypes for acoustic parameter
vectors.

      In discrete-parameter speech recognition systems, a vector
quantiser outputs an acoustic label at regular intervals.  In one
prominent approach to speech recognition, each label is characterized
by a context-dependent "prototype" consisting of a mixture of
diagonal Gaussian distributions.  There are several mixtures per
label, one of which is selected to assess the likelihood of an
acoustic vector.  Decision trees are constructed that examine the
phonetic context of each vector.  For any frame, the appropriate
mixture is determined from the phonetic context of that frame by
tracing a path from the root to a leaf of the decision tree.

      This procedure for constructing the decision trees has two
drawbacks.  First, the questions used to split the data are simple
questions of the form "Is the phone in position i in the subset S?",
whereas in many cases we would like to ask more complex questions
involving more than one phone.  Second, the decision tree tends to
fragment the data, resulting in similar prototypes at different
leaves of the tree.  The invention described here generalizes the
procedure by constructing a decision network instead of a decision
tree, eliminating both drawbacks.
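
      The conventional tree-based selection described above can be
sketched roughly as follows.  This is only an illustration under
stated assumptions, not the disclosed system: the names Node,
find_leaf and log_likelihood, the representation of a phonetic
context as a mapping from relative position to phone symbol, and the
storage of a mixture as (weight, means, variances) triples are all
hypothetical.

```python
import math
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Optional, Tuple

# Assumed representation: relative position (-1 = previous phone,
# +1 = next phone, ...) mapped to a phone symbol.
Context = Dict[int, str]
# Assumed representation: one (weight, means, variances) triple per
# diagonal Gaussian component.
Mixture = List[Tuple[float, List[float], List[float]]]


@dataclass
class Node:
    # Internal node: "Is the phone in position `position` in `subset`?"
    position: Optional[int] = None
    subset: Optional[FrozenSet[str]] = None
    yes: Optional["Node"] = None
    no: Optional["Node"] = None
    # Leaf node: the mixture selected for this context.
    mixture: Optional[Mixture] = None


def find_leaf(root: Node, context: Context) -> Node:
    """Trace a path from the root to a leaf by answering each question."""
    node = root
    while node.mixture is None:
        phone = context.get(node.position)
        node = node.yes if phone in node.subset else node.no
    return node


def log_likelihood(mixture: Mixture, x: List[float]) -> float:
    """Log-likelihood of vector x under a diagonal Gaussian mixture."""
    terms = []
    for weight, means, variances in mixture:
        ll = math.log(weight)
        for xi, m, v in zip(x, means, variances):
            ll += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        terms.append(ll)
    # Log-sum-exp over components for numerical stability.
    top = max(terms)
    return top + math.log(sum(math.exp(t - top) for t in terms))
```

      Each internal node here asks exactly one single-phone question,
which is the limitation the first drawback refers to.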

      The input to the decision network construction algorithm is a
set of acoustic vectors that are aligned against one arc in the
inventory of hidden Markov model arcs.  Each vector is tagged with
the phonetic context in which it was realized.  The decision network
construction goes through the following steps:

1.  Let all vectors be at the root node of the network.

2.  If there are no more nodes to be split, terminate.  Otherwise
    select some node n for splitting.

3.  If the number of samples N_n at node n is smaller than a
    threshold T_s, then make n a leaf node and go to step 2.
    Otherwise go to the next step.
    Otherwise go to the next step.

4.  Compute the evaluation function m(q,n) for all questions q
   ...
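
      The control flow of steps 1 to 3 can be sketched as a simple
work-list loop.  Because this abbreviated text breaks off at step 4,
the sketch does not attempt to reproduce the evaluation function
m(q,n) or anything that follows it; all of that is delegated to a
caller-supplied function, here called split, which is purely a
placeholder.  The names NetNode and build_network, the first-in
first-out choice of the next node, and the assumption that split
partitions the samples into two or more non-empty parts are likewise
assumptions made for illustration.  As written, the sketch produces a
tree; whatever turns it into a network lies in the elided steps.

```python
from collections import deque
from typing import Callable, List, Optional, Sequence


class NetNode:
    def __init__(self, samples: Sequence):
        self.samples = list(samples)
        self.is_leaf = False
        self.children: List["NetNode"] = []


def build_network(
    samples: Sequence,
    min_samples: int,
    split: Callable[[list], Optional[List[list]]],
) -> NetNode:
    # Step 1: all vectors start at the root node.
    root = NetNode(samples)
    pending = deque([root])

    # Step 2: terminate when no nodes remain to be split; otherwise
    # select a node (FIFO order is an assumption of this sketch).
    while pending:
        node = pending.popleft()

        # Step 3: fewer than T_s samples makes the node a leaf.
        if len(node.samples) < min_samples:
            node.is_leaf = True
            continue

        # Placeholder for step 4 onward, which is elided in the text.
        parts = split(node.samples)
        if parts is None or len(parts) < 2 or any(len(p) == 0 for p in parts):
            node.is_leaf = True
            continue

        node.children = [NetNode(p) for p in parts]
        pending.extend(node.children)

    return root
```

      For example, calling build_network(vectors, 3, halve) with a
hypothetical halve function that cuts a list in two keeps splitting
until every remaining node holds fewer than three vectors.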