Optimizing the Votes in a Polling Speech Recognition System

IP.com Disclosure Number: IPCOM000102170D
Original Publication Date: 1990-Nov-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 2 page(s) / 75K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+4]

Abstract

This article describes a vote-training algorithm for a speech recognizer which adjusts a given set of polling votes so as to optimize the results achieved by the polling approach.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

      In a polling approach to speech recognition, a speech input is
converted into a string of labels.  Each label is selected from a
predefined alphabet, and each label in the alphabet represents an
acoustic prototype of speech.  For a subject word in the vocabulary,
each label has a vote, which reflects the likelihood that a model for
the subject word produces that label.  When a string of labels is
generated in response to an unknown speech input, the votes of the
labels in the string for the subject word are added to give a score
for the word.  This procedure is repeated for each word in the
vocabulary.
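
      The scoring procedure described above can be sketched as
follows.  This is only an illustration under stated assumptions:
labels are represented as integer indices into the alphabet, the
votes are held in a dictionary keyed by word, and the names
polling_scores, votes and labels, as well as the toy vote values,
are hypothetical and do not come from the article.

```python
from typing import Dict, List, Sequence


def polling_scores(
    labels: Sequence[int],
    votes: Dict[str, List[float]],
) -> Dict[str, float]:
    """For every vocabulary word, add up the votes of the labels in the string."""
    return {
        word: sum(word_votes[label] for label in labels)
        for word, word_votes in votes.items()
    }


# Toy data (assumed): an alphabet of 3 labels and a two-word vocabulary.
# votes[word][label] is the vote of that label for that word.
votes = {
    "yes": [2.0, 0.5, 0.0],
    "no": [0.0, 1.0, 2.0],
}
labels = [0, 0, 1]  # label string produced for an unknown utterance

scores = polling_scores(labels, votes)
# Order the words by score, best first, to obtain the short list.
ranked = sorted(scores, key=scores.get, reverse=True)
```

      Because the score is a plain sum over the label string, the
order of the labels does not affect it; only the label counts do.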

      When the polling procedure is completed, a short list of words
ordered by vote score is obtained.  A major goal is to find votes, if
possible, such that the correct word is always at the top of the list
and has a score much better than that of any other (incorrect) word.

      The algorithm specified below adjusts the votes whenever the
correct word is not at the top of the list or when another
(incorrect) word has a polling score almost as high as that of the
correct word.  Applying these adjustments iteratively over some
training data leads to votes which are optimized for the polling
procedure.
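
      The condition that triggers an adjustment can be sketched as a
single margin test.  This is a minimal sketch, not the article's
procedure: the function name needs_adjustment, the margin parameter
and its default value are assumptions introduced here to stand in
for "almost as high", and the adjustment itself is not shown because
it is not specified in this abbreviated text.

```python
from typing import Dict, List, Sequence


def needs_adjustment(
    labels: Sequence[int],
    votes: Dict[str, List[float]],
    correct_word: str,
    margin: float = 1.0,  # hypothetical threshold, not given in the article
) -> bool:
    """Return True when the correct word does not lead by at least `margin`.

    Covers both cases named in the text: the correct word is not at the
    top (negative lead) or a rival scores almost as high (small lead).
    Assumes the vocabulary holds at least two words.
    """
    scores = {
        word: sum(word_votes[label] for label in labels)
        for word, word_votes in votes.items()
    }
    best_rival = max(s for w, s in scores.items() if w != correct_word)
    return scores[correct_word] - best_rival < margin


# Toy votes (assumed): votes[word][label] for a 3-label alphabet.
votes = {"yes": [2.0, 0.5, 0.0], "no": [0.0, 1.0, 2.0]}

clear_win = needs_adjustment([0, 0, 1], votes, "yes")
wrong_top = needs_adjustment([0, 1, 2], votes, "yes")
close_call = needs_adjustment([0, 1, 1], votes, "yes", margin=1.5)
```

      In an iterative setting, such a test would be evaluated for
each training utterance, with the votes adjusted only for the
utterances where it returns True.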

      Training data (label strings) that include one or more
utterances of each word in the vocabulary are assumed to exist.

      The steps of the algorithm are as follows:
      Step 1:  For each word W in the vocabulary there is an
adjustment counter CW and an adjustment vector AW with N elements,
where N is the number of labels in the label alphabet.  All counters
CW are set to zero, and all vectors AW to...