Browse Prior Art Database

Run-Length Coding of Label Sequences and Baseforms to Speed Up Match Computations

IP.com Disclosure Number: IPCOM000112914D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 89K

Publishing Venue

IBM

Related People

Gopalakrishnan, PS: AUTHOR [+3]

Abstract

Disclosed is a method for compressing the label sequences that are input to acoustic match modules as well as the baseform in order to speed up the computation of the match scores.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Run-Length Coding of Label Sequences and Baseforms to Speed Up Match
Computations

      Disclosed is a method for compressing the label sequences that
are input to acoustic match modules as well as the baseform in order
to speed up the computation of the match scores.

      In discrete-parameter speech recognition systems, a vector
quantiser outputs an acoustic label at regular intervals.  These
labels are passed to a fast match and detailed acoustic match stage
which compute the probability of the acoustic label string given the
models for the hypothesized word sequence.  The computation of this
probability score for each word is done using the forward-pass
algorithm [1] and usually involves several thousand operations.  This
may be an inordinate amount of computation to be done in real time on
a personal computer which does not include any special hardware to
assist speech recognition.  This invention describes one method for
reducing the number of operations involved in the match computations
substantially without unduly sacrificing recognition accuracy.  A
similar techniques can also be used to reduce the length of the
baseforms used in computing the scores in the detailed match.
Details of this technique are also presented below.

      Run-length compression of labels - The label stream that is
input to the matches using run-length coding is compressed as
follows.  The probability of the acoustics given a word model is
calculated using the forward pass algorithm.  During the execution of
this algorithm, if a sequence of adjoining frames have the same
label, then compute the forward pass value for only the last time
frame in this sequence of identical labels.

      In particular, let l sub 1 l sub 2 ellipsis l sub n be a
sequence of labels and l sub i l sub <i+1> ellipsis l sub i+R be
identical.  Require R to be less than some maximum allowable
run-length to make sure that the approximations introduced by the
run-length compression does not hurt the decoding accuracy.  A good
choice for this maximum is 4.  While doing the forward pass
calculation, in the above instance, we skip time frames i to i+R-1.
For the time frame i+R, replace all output probabilities by their R
sup th power and carry on the computation as usual.  The transition
probabilities also have to be modified.  The exact modifications
applied to the transition probabilities are described in the next
section.  In order to ensure that the computations in the rest of the
parts of the decoder are consistent, we copy the results from time
frame i-1 to i+R-1 without change....