Procedure for Automatic Script Determination for Continuous Speech Recognition Systems

IP.com Disclosure Number: IPCOM000100294D
Original Publication Date: 1990-Mar-01
Included in the Prior Art Database: 2005-Mar-15
Document File: 2 page(s) / 66K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+5]

Abstract

Because words can have multiple pronunciations (called lexemes), the phone context of a given phone P cannot be determined uniquely from knowledge of the word sequence in the neighborhood of P. For this reason, it is convenient to work in units of lexemes instead of words when recognizing continuous speech. It is more convenient for speakers, however, if their training scripts are written in terms of words rather than lexemes. Thus, after a script has been recorded, it is necessary to determine which lexemes were actually uttered, and, in addition, where the speaker paused.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

      This invention determines a detailed lexeme script
automatically.  It improves on (1) in that it is more accurate and
does not require a special-purpose language model to handle pauses
during recognition.

      The following steps are performed:
Step  1. Construct a Markov model for each lexeme in the vocabulary.
Step  2. For each word W in the vocabulary, create a Markov word
model for W from the lexemes of W by linking together all the lexeme
models in parallel.
Step  3. Append to each word model a deletable Markov model to
represent silence (a pause).
Step  4. Obtain trained statistics for the constructed word models
using the forward-backward algorithm (2).
Step  5. Remove from each word model the silence model appended in
Step 3.
Step  6. Create a separate word...
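
The model construction in Steps 1-3 and the removal in Step 5 can be sketched as follows. This is a minimal illustration, not the disclosed implementation: each lexeme model is reduced to a linear phone sequence instead of a trained Markov model, and the "deletable" silence model is represented as an optional trailing SIL phone. The class and function names, the SIL label, and the sample pronunciations are all assumptions made for the example. Training with the forward-backward algorithm (Step 4) is not shown.

```python
from dataclasses import dataclass, replace
from itertools import product

SILENCE = "SIL"  # hypothetical label for the silence (pause) model


@dataclass(frozen=True)
class LexemeModel:
    """Step 1 stand-in: a lexeme reduced to a linear phone sequence."""
    name: str
    phones: tuple


@dataclass(frozen=True)
class WordModel:
    """Step 2: the lexeme models of one word, linked in parallel."""
    word: str
    lexemes: tuple
    has_silence: bool = False  # True once Step 3 has been applied

    def paths(self):
        """List every (lexeme name, phone sequence) path through the model.

        A deletable silence model contributes two alternatives per lexeme:
        one that skips the silence and one that passes through it.
        """
        tails = [(), (SILENCE,)] if self.has_silence else [()]
        return [(lex.name, lex.phones + tail)
                for lex in self.lexemes
                for tail in tails]


def build_word_model(word, pronunciations):
    """Steps 1-2: one lexeme model per pronunciation, placed in parallel."""
    lexemes = tuple(
        LexemeModel(f"{word}_{i}", tuple(phones))
        for i, phones in enumerate(pronunciations, start=1)
    )
    return WordModel(word, lexemes)


def append_silence(model):
    """Step 3: append the deletable silence model."""
    return replace(model, has_silence=True)


def remove_silence(model):
    """Step 5: remove the silence model appended in Step 3."""
    return replace(model, has_silence=False)


def candidate_scripts(word_models):
    """All lexeme-and-pause sequences the word script allows."""
    return list(product(*(m.paths() for m in word_models)))


if __name__ == "__main__":
    # Illustrative pronunciations only; not taken from the disclosure.
    the = append_silence(build_word_model("the", [["DH", "AH"], ["DH", "IY"]]))
    data = append_silence(
        build_word_model("data", [["D", "EY", "T", "AH"], ["D", "AE", "T", "AH"]])
    )
    for script in candidate_scripts([the, data]):
        print(script)
```

With two lexemes per word and an optional pause after each, a two-word script admits 16 candidate lexeme scripts. Choosing among such candidates for a recorded utterance is the determination problem the procedure addresses.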