Construction of a Tree for Context Dependent Prototypes in a Speech Recognition System

IP.com Disclosure Number: IPCOM000106615D
Original Publication Date: 1993-Dec-01
Included in the Prior Art Database: 2005-Mar-21
Document File: 4 page(s) / 96K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+4]

Abstract

This invention presents a method for constructing a decision tree that examines the phonetic context to arrive at context-dependent prototypes for acoustic parameter vectors.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Construction of a Tree for Context Dependent Prototypes in a Speech Recognition System

      This invention presents a method for constructing a decision
tree that examines the phonetic context to arrive at
context-dependent prototypes for acoustic parameter vectors.

      In discrete-parameter speech recognition systems, a vector
quantiser outputs an acoustic label at regular intervals.  In one
prominent approach to speech recognition [1], each label is
characterized by a "prototype" consisting of a mixture of diagonal
Gaussian distributions, and the label output identifies the prototype
which maximizes the likelihood of a corresponding acoustic parameter
vector.  There is one prototype mixture per label, and it does not
depend on the phonetic context of the frame being labelled.  The
invention described in [2]  generalizes the concept of Gaussian
mixture prototypes so as to make them context-dependent.  Instead of
having one mixture per label, there are several mixtures per label,
one of which will be selected to assess the likelihood of an acoustic
vector.  Decision trees are constructed that examine the phonetic
context of each vector.  For any frame, the appropriate mixture is
determined from the phonetic context of the corresponding frame by
tracing a path to the leaf of the decision tree.  This disclosure
describes a method for constructing the decision trees to be used in
the scheme described in [2].
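
      As an informal illustration (not part of the original
disclosure), the following Python sketch shows how such a
context-dependent prototype might be used: the phonetic context is
traced down a binary decision tree to select one of several mixtures
for the label, and the likelihood of the acoustic parameter vector is
then evaluated under that mixture of diagonal Gaussians.  The names
TreeNode, select_mixture and log_likelihood are illustrative only.

    import math

    class TreeNode:
        """Binary decision-tree node; leaves carry a Gaussian mixture."""
        def __init__(self, question=None, yes=None, no=None, mixture=None):
            self.question = question   # predicate on the phonetic context; None at a leaf
            self.yes, self.no = yes, no
            self.mixture = mixture     # list of (weight, means, variances) at a leaf

    def select_mixture(root, context):
        """Trace the phonetic context to a leaf to pick a mixture (cf. [2])."""
        node = root
        while node.question is not None:
            node = node.yes if node.question(context) else node.no
        return node.mixture

    def log_likelihood(vector, mixture):
        """Log-likelihood of a vector under a mixture of diagonal Gaussians."""
        total = 0.0
        for weight, means, variances in mixture:
            log_g = 0.0
            for x, m, v in zip(vector, means, variances):
                log_g += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
            total += weight * math.exp(log_g)
        return math.log(total)

      In this arrangement the single context-independent prototype per
label is replaced by a leaf-dependent mixture chosen from the phonetic
context of the frame.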

      The input to the decision tree construction algorithm is a set
of acoustic vectors that are aligned against one arc in the inventory
of hidden Markov model arcs.  Each vector is tagged with the
phonetic context in which it was realized.  The decision tree
construction is similar to the algorithm presented in [3] and goes
through the following steps:

1.  Let all vectors be at the root node of the tree.

2.  If there are no more nodes to be split, terminate.  Otherwise,
    select some node n for splitting.

3.  If the number of samples N[n] at node n is smaller than a
    threshold T[s], then make n a leaf of the tree and go to Step 2.
    Otherwise, go to the next step.

4.  Compute the evaluation function m(q,n) for all questions q ∈ Q
    at this node.  Q is a fixed set of questions.  This may be the
    same as the set used in [3].

5.  Select the question q* = arg max[q] m(q,n).  If m(q*,n) < T[m]
 ...
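
      A rough Python rendering of Steps 1 through 5 is sketched below;
it is not part of the original disclosure, and all names are
illustrative.  The evaluation function m(q,n) and the question set Q
are left abstract, and the action taken when m(q*,n) < T[m] is an
assumption, since the text is abbreviated at that point; here the node
is simply made a leaf.

    # Hypothetical sketch of Steps 1-5 above; all names are illustrative.
    def build_tree(vectors, questions, T_s, T_m, evaluate):
        """vectors:   samples, each tagged with a .context (phonetic context).
        questions: the fixed set Q of predicates on a phonetic context.
        evaluate:  the evaluation function m(q, n), scoring question q at node n."""
        def new_node(samples):
            return {"vectors": samples, "question": None, "yes": None, "no": None}

        root = new_node(vectors)            # Step 1: all vectors start at the root
        pending = [root]
        while pending:                      # Step 2: terminate when no node is left to split
            node = pending.pop()
            if len(node["vectors"]) < T_s:  # Step 3: fewer than T_s samples, keep as a leaf
                continue
            # Step 4: compute m(q, n) for every candidate question q in Q
            scores = {q: evaluate(q, node["vectors"]) for q in questions}
            best_q = max(scores, key=scores.get)
            if scores[best_q] < T_m:        # Step 5 (assumed): best score below T_m, keep as a leaf
                continue
            yes = [v for v in node["vectors"] if best_q(v.context)]
            no = [v for v in node["vectors"] if not best_q(v.context)]
            node["question"] = best_q
            node["yes"], node["no"] = new_node(yes), new_node(no)
            pending.extend([node["yes"], node["no"]])
        return root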