
Method of Adding New Word Models to a Pre-existing Set

IP.com Disclosure Number: IPCOM000108379D
Original Publication Date: 1992-May-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 2 page(s) / 95K

Publishing Venue

IBM

Related People

Davies, K: AUTHOR [+4]

Abstract

Disclosed is a technique for adding new word models to a pre-existing speech recognition vocabulary.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      To add new words to a pre-existing speech recognition
vocabulary, new word models (fenemic baseforms) must be constructed.
However, the existing word models were derived (1,2) from the speech
of several talkers who are no longer available for additional
recordings.  The problem is therefore to construct new word models
that are consistent with those already in existence.  This article
describes a method for doing so.
1.  Have at least five talkers record all the words for which
baseforms are to be created.  Also have each talker record a standard
training script.  There should be a total of at least 40 minutes of
speech from each talker to allow for proper parameter estimation (see
below).
2.  Train on each talker's recording of the standard training script
in the usual fashion.
3.  Locate the endpoints of each of the new words in the recorded
sentences.  This may be done by hand, with a silence detector, or by
alignment, using the statistics obtained in the previous step,
against a set of phonetic baseforms created by, say,
spelling-to-sound rules (3,4).
4.  For each talker, obtain a Viterbi alignment of the training text
against the current set of fenemic baseforms.  Then obtain a set of
supervised prototypes for each talker by computing the mean and
covariance of the acoustic vectors aligned with each fenemic phone.
5.  Label the add-word script of each talker with the supervised
prototypes obtained in the previous step.
6.  Grow fenemic baseforms (5) for the add-word data using the
statistics used in baseform growing for the original vocabulary.
7.  Align each talker's complete set of data (both training and
add-word) against the set of baseforms obtained by merging the new
fenemic baseforms with the original baseforms.
8.  Compute new supervised prototypes for each talker, as in step 4.
9.  Repeat step 5 with the new supervised prototypes.
10. Repeat step 6 with the new supervised labels.
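
The prototype and labeling steps (4 and 5, repeated as 8 and 9) can be
illustrated with a minimal Python sketch.  It is not the authors'
implementation.  It assumes that each frame is a short vector of
floats, that the Viterbi alignment is already available as one
fenemic phone name per frame, and that labeling picks the prototype
with the highest Gaussian log-likelihood.  For brevity it keeps only
the diagonal of the covariance, which is a simplification of the
"mean and covariance" named in step 4.  The names
supervised_prototypes, label_frames and var_floor are hypothetical.

```python
import math
from collections import defaultdict


def supervised_prototypes(frames, alignment, var_floor=1e-3):
    """Step 4 sketch: per-phone mean and diagonal variance.

    frames:    list of feature vectors (lists of floats)
    alignment: one fenemic phone name per frame (assumed to come
               from a Viterbi alignment computed elsewhere)
    var_floor: hypothetical lower bound that keeps variances positive
    """
    by_phone = defaultdict(list)
    for vec, phone in zip(frames, alignment):
        by_phone[phone].append(vec)

    prototypes = {}
    for phone, vecs in by_phone.items():
        n, dim = len(vecs), len(vecs[0])
        mean = [sum(v[d] for v in vecs) / n for d in range(dim)]
        var = [
            max(sum((v[d] - mean[d]) ** 2 for v in vecs) / n, var_floor)
            for d in range(dim)
        ]
        prototypes[phone] = (mean, var)
    return prototypes


def log_likelihood(vec, mean, var):
    """Log density of vec under a diagonal-covariance Gaussian."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * s) + (x - m) ** 2 / s
        for x, m, s in zip(vec, mean, var)
    )


def label_frames(frames, prototypes):
    """Step 5 sketch: label each frame with its best-scoring prototype."""
    return [
        max(prototypes, key=lambda p: log_likelihood(vec, *prototypes[p]))
        for vec in frames
    ]


# Toy data: two phones, two training frames each (illustrative only).
train_frames = [[0.0, 0.1], [0.2, -0.1], [5.0, 5.1], [4.8, 4.9]]
train_alignment = ["F1", "F1", "F2", "F2"]
protos = supervised_prototypes(train_frames, train_alignment)

# Label two frames of hypothetical add-word data.
labels = label_frames([[0.1, 0.0], [5.1, 5.0]], protos)
print(labels)
```

In the procedure above, the same two functions would be called a
second time after the realignment of step 7, so that the add-word
data is relabeled with prototypes estimated from both the training
and the add-word speech.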

      To compare the above procedure against several other techniques
for obtaining baseforms, a new talker recorded the usual training
script and a test script containing a single utterance of each of the
new words, for a total of 1030 words.  The...