Correcting Unconditional Statistics in Discrete-Parameter Markov Source Models to Reflect Dependence in the Output Sequence

IP.com Disclosure Number: IPCOM000099918D
Original Publication Date: 1990-Mar-01
Included in the Prior Art Database: 2005-Mar-15
Document File: 2 page(s) / 91K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+3]

Abstract

In discrete-parameter Markov source models, it is assumed that at time t, some arc a(t) produces an observed label f(t). For simplicity, it is often assumed that Pr(f(t) | a(t), f(t-1)) = Pr(f(t) | a(t)).

       In discrete-parameter Markov source models, it is assumed
that at time t, some arc a(t) produces an observed label f(t).  For
simplicity, it is often assumed that

      Pr(f(t) | a(t), f(t-1)) = Pr(f(t) | a(t))     (1)

which is a very bad approximation in the case of speech recognition.
In devices where assumption (1) has been designed into the hardware
or software, the problem cannot easily be fixed retroactively.  In
this document, we leave the faulty design intact, and seek replacement
statistics for the output probabilities Pr(f | a), which attempt to
make the device behave as closely as possible to a correctly-designed
device.
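
      To make assumption (1) concrete: under it, the label emitted at
time t is drawn from the active arc's output distribution alone, never
consulting f(t-1).  Below is a minimal generative sketch in Python;
the names generate_labels, arc_sequence, and O are illustrative and do
not come from the disclosure.

    import random

    # Minimal sketch of label generation under assumption (1): the
    # draw for f(t) depends only on the active arc a(t), never on
    # f(t-1).  O[h] is a hypothetical dict mapping each label j to
    # Pr(f = j | a = h).
    def generate_labels(arc_sequence, O):
        labels = []
        for h in arc_sequence:
            dist = O[h]
            j = random.choices(list(dist.keys()),
                               weights=list(dist.values()))[0]
            labels.append(j)   # f(t-1) is never consulted here
        return labels

      A correctly-designed device would condition each draw on the
previous label as well; the procedure below adjusts the per-arc
statistics so that a device built on assumption (1) mimics that
behavior.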

      The same approach is adopted in (*), where the statistics are
adjusted on the basis of runs in the label sequence.  In this
document, the statistics are adjusted for more general associations
between labels, whether or not they are manifested in runs.

      We will assume the existence of some training data, a training
script, and some trained Markov model statistics.  The following steps
are performed; a code sketch of the procedure follows the list.
Step 1. Let C(i,j) denote the number of times label i was followed by
label j in the training data.  Compute C(i,j) for all labels i, j.
Step 2. Let F(i) denote the number of times that label i occurred in
the training data as the first member of a pair of labels. Compute
F(i) as the sum of the ith row of C.
Step 3. Let S(j) denote the number of times that label j occurred in
the training data as the second member of a pair of labels. Compute
S(j) as the sum of the jth column of C.
Step 4. Estimate the unconditional probability of observing label j
as Q(j) = S(j) / T, where T is the sum of the elements of S.
Step 5. Estimate the probability of observing label j given that
label i has just occurred as P(j | i) = C(i,j) / F(i).
Step 6. Compute the ratio of the conditional and unconditional label
probabilities R(i,j) = P(j | i) / Q(j), for all i,j.
Step 7. Perform Steps 8-9 for each arc h in the Markov model arc
inventory.
Step 8. Let O(j | h) denote the trained (output) probability of
observing label j when arc h is active.  Compute Xh(i,j) = O(j | h) x
R(i,j).  Xh(i,j) is an unnor...
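
      As a concrete rendering of Steps 1-8, the following NumPy sketch
may help.  Everything in it is illustrative: train_labels is assumed
to be the training label sequence coded as integers 0..L-1, and O an
array of trained output probabilities with O[h, j] = O(j | h); every
label is assumed to occur in the training data, so that F(i) and Q(j)
are nonzero.  Step 9 is cut off in this abbreviated copy, so the
sketch stops at the unnormalized quantities Xh(i,j).

    import numpy as np

    def unnormalized_replacement_stats(train_labels, O):
        # Steps 1-8 of the procedure above (illustrative names).
        L = O.shape[1]
        # Step 1: C(i,j) = number of times label i was followed by j.
        C = np.zeros((L, L))
        for i, j in zip(train_labels[:-1], train_labels[1:]):
            C[i, j] += 1
        F = C.sum(axis=1)      # Step 2: row sums F(i)
        S = C.sum(axis=0)      # Step 3: column sums S(j)
        Q = S / S.sum()        # Step 4: Q(j) = S(j) / T
        P = C / F[:, None]     # Step 5: P(j | i) = C(i,j) / F(i)
        R = P / Q[None, :]     # Step 6: R(i,j) = P(j | i) / Q(j)
        # Steps 7-8: Xh(i,j) = O(j | h) x R(i,j) for every arc h;
        # X[h, i, j] holds Xh(i,j).
        X = O[:, None, :] * R[None, :, :]
        # Step 9, truncated in this copy, would convert these
        # unnormalized quantities into the replacement statistics.
        return X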