Impulse Response Fast Match

IP.com Disclosure Number: IPCOM000108046D
Original Publication Date: 1992-Apr-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 2 page(s) / 89K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+5]

Abstract

In large vocabulary speech recognition it is advantageous to carry out a fast match (1,2) to reduce the number of word candidates for which the detailed match must be evaluated.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

       In large vocabulary speech recognition it is advantageous
to carry out a fast match (1,2) to reduce the number of word
candidates for which the detailed match must be evaluated.

      The input to the acoustic fast match is the end-time
distribution of the detailed match of the sequence of words on a path
chosen for extension.  In isolated speech recognition, because of the
silences between words, the end-time distributions for many paths are
the same.  For this reason, it is usually sufficient to perform only
one acoustic fast match per word spoken, with the results being
shared by a multiplicity of paths.

      In continuous speech, the end-time distributions for different
paths tend to be very different, and, therefore, the scheme described
above is not very effective.  This invention describes a method that
allows the acoustic fast match calculation to be shared among paths
even when their end-time distributions differ.

      Let the acoustic output for a sentence be n frames.  We will
perform a fast match starting at each time index t = 0, 1, ..., n,
assuming that the input distribution is an impulse w(t).  Thus, for
each time index t, we obtain an acoustic fast match list L(t), which
consists of a set of words that match well, starting at time t.  Each
list also contains the value of the fast match score for each word
in the list.  We refer to these lists L(t) as impulse response fast
match lists.
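
      As an illustration only, the construction of the lists L(t) might
be sketched as follows.  The disclosure does not specify how a word is
scored from a given start time, so the scorer passed in as score_from,
the fixed list length list_size, and the toy scorer toy_score are all
assumptions introduced for this sketch.

```python
def build_impulse_lists(n, vocabulary, score_from, list_size):
    """Return [L(0), L(1), ..., L(n)], each a dict mapping word -> score.

    score_from(word, t) is a HYPOTHETICAL stand-in for the acoustic fast
    match of `word` started with an impulse at time t.  Keeping the
    list_size best words per list is likewise an assumption.
    """
    lists = []
    for t in range(n + 1):  # start times t = 0, 1, ..., n
        scored = [(word, score_from(word, t)) for word in vocabulary]
        # Highest score first; keep only the words that match well.
        scored.sort(key=lambda pair: pair[1], reverse=True)
        lists.append(dict(scored[:list_size]))
    return lists


def toy_score(word, t):
    """Toy scorer for demonstration only; it has no acoustic meaning."""
    return 1.0 / (1 + abs(len(word) - t))


impulse_lists = build_impulse_lists(2, ["a", "bb", "ccc"], toy_score, 2)
```

Each list is computed once per start time and is independent of any
particular path, which is what allows it to be reused later.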

      Assume we are given an input end-time distribution
(e(a), e(a+1), ..., e(b)) which starts at time t=a and goes to time
t=b.  We can construct the acoustic fast match list for this
distribution by combining the impulse response fast match lists for
times t=a, t=a+1, ..., t=b.  The fast match score for a word in this
composite list is obtained by multiplying its score in L(t) by e(t)
and summing these values in the range t=a to t=b.
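
      The combination step described above can be sketched in a few
lines.  This is a minimal illustration, not the authors' implementation.
It assumes that scores are in the linear (probability) domain, that a
word absent from L(t) contributes zero at that time, and that each list
is stored as a dictionary.  The function name combine_impulse_lists and
the toy words and scores are invented for the example.

```python
from collections import defaultdict


def combine_impulse_lists(impulse_lists, end_time_dist, a):
    """Build the composite fast match list for an end-time distribution.

    impulse_lists[t] maps word -> score in L(t).
    end_time_dist[i] holds e(a + i), so the distribution spans
    t = a .. a + len(end_time_dist) - 1.
    A word missing from L(t) is ASSUMED to contribute zero at time t.
    """
    composite = defaultdict(float)
    for offset, weight in enumerate(end_time_dist):
        t = a + offset
        for word, score in impulse_lists[t].items():
            # Accumulate e(t) * (score of the word in L(t)).
            composite[word] += weight * score
    return dict(composite)


# Toy impulse response lists for t = 3, 4, 5 (values are made up).
toy_lists = {
    3: {"cat": 0.5, "cap": 0.25},
    4: {"cat": 0.25, "hat": 0.5},
    5: {"hat": 0.75},
}
e = [0.5, 0.25, 0.25]  # e(3), e(4), e(5)

composite = combine_impulse_lists(toy_lists, e, a=3)
# composite == {"cat": 0.3125, "cap": 0.125, "hat": 0.3125}
```

Because the lists L(t) do not depend on the path, two paths with
different end-time distributions reuse the same lists and differ only
in the weights e(t) applied during this summation.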

      The acoustic fast match scores can then be combined with the
language model scores in the usual way to produce a pruned final
list.  This pruning is normally done based on i) rank in the list,
ii) the value of the score (combined acoustic and language model
score), and i...