Browse Prior Art Database

Channel-Bank-Based Thresholding to Improve Search Time in the Fast-Match

IP.com Disclosure Number: IPCOM000111044D
Original Publication Date: 1994-Feb-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 110K

Publishing Venue

IBM

Related People

Gopalakrishnan, PS: AUTHOR [+4]

Abstract

An algorithm is disclosed for improving the search time of the fast match, by pruning additional nodes in the fast match tree This is done by using a channel-bank, and thresholds at the output of the different channels. In addition, we introduce the use of some parameters, that can be obtained while training, to reduce the computational effort further.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 50% of the total text.

Channel-Bank-Based Thresholding to Improve Search Time in the Fast-Match

      An algorithm is disclosed for improving the search time of the
fast match, by pruning additional nodes in the fast match tree This
is done by using a channel-bank, and thresholds at the output of the
different channels.  In addition, we introduce the use of some
parameters, that can be obtained while training, to reduce the
computational effort further.

      In the Tangora Speech Recognition System, the decoder computes
an acoustic match score between the spoken word, and some short list
of words from the 20K vocabulary, and uses this score along with
information about the previous spoken words (incorporated through the
language model) to determine a most probable candidate for the spoken
word.  For purposes of the acoustic fast match score computation,
each word is represented as a phone-sequence, with the phone
sequences corresponding to all the words in the 20K vocabulary being
organized in the form of a tree.  The short list of words, for which
the detailed acoustic match score is computed, is provided in this
case by a fast match algorithm, which traverses down the tree
selecting possible candidates from the 20K vocabulary.  The fast
match algorithm is outlined in [*], essentially, as the search
proceeds down the tree, a score is computed for every  node that is
visited, that corresponds to the acoustic match score between the
spoken word and the phone sequence in the tree upto that node.  If
this score falls below a threshold value, the node, and all its
successor nodes, are pruned from the search (the threshold is
determined dynamically based on the score profile of the highest
ranking word encountered till that point of time).  In this
disclosure, a method is described to reduce the number of nodes
visited by the fast match algorithm, by using local information
obtained from a channel-bank, to further prune the tree.  Further,
techniques are also outlined to use phone-dependent parameters,
obtained during training, to reduce the computational effort in the
computation of the acoustic fast match score, and the dynamic
threshold profile.

      a).  Let  f sub i  denote the sequence of input labels ( f sub
i memberof Z , where
 Z  denotes the label-alphabet), and
 i denotes the time index.  Further assume that the HMM's for each
phone are such that the set of output probabilities on all the
branches of a HMM are identical, and that

 m sub u ( f sub i ) denotes the output probability of the label  f
sub i given phone  u , ( u memberof U  where  U  denotes the
phone-alphabet).  A channel-bank is designed to compute the acoustic
match score between each phone and the last  N  labels.  Hence, there
are as many channels as there are phones, and the output of the
 u sup th   channel at time  n  is given by

C sub u (n) = product from k=n-N to n of  m sub u (f sub k ) %%% u
memberof U.

As only the last  N  labels are be...