
Automatic Synthesis of Fenemic Markov Word Models from Phonetic Markov Word Models

IP.com Disclosure Number: IPCOM000108116D
Original Publication Date: 1992-Apr-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 2 page(s) / 80K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+5]

Abstract

In speech recognition systems employing Markov word models, two types of word model are prominent: phonetic and fenemic. Phonetic word models can be obtained easily from phonetic transcriptions as listed in a dictionary. Fenemic word models are more accurate but require that each word in the vocabulary be uttered one or more times. Because this is inconvenient, efforts have been made to synthesize artificial fenemic word models from phonetic word models. This article provides a successful method of synthesizing fenemic word models from phonetic models.


Automatic Synthesis of Fenemic Markov Word Models from Phonetic Markov Word Models


      The method assumes training data consisting of about 5000 different words uttered by each of about 10 speakers.  Each speaker should utter the same 5000 words, though not necessarily in the same order.  It is also assumed that this training data has been signal-processed and converted to a series of parameter vectors, and, finally, that phonetic word models exist for each word in the training data.  The purpose of this training data is to provide examples of all phonetic phones in many different phone contexts, as realized by different speakers.
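
      The disclosure does not prescribe any particular data layout; as a purely illustrative sketch, the assumed training material might be organized as follows.  The record names, dimensions, and the sample transcription here are hypothetical, and Python is used only for illustration.

from dataclasses import dataclass
from typing import Dict, List
import numpy as np

@dataclass
class Utterance:
    """One training utterance: a single word spoken once by one speaker."""
    speaker_id: str        # one of the ~10 training speakers (hypothetical id)
    word: str              # one of the ~5000 vocabulary words
    frames: np.ndarray     # (num_frames, dim) parameter vectors from signal processing

# A phonetic word model (baseform) is assumed to exist for every word,
# e.g. a phone sequence taken from a dictionary transcription.
phonetic_baseforms: Dict[str, List[str]] = {
    "water": ["W", "AO", "T", "AXR"],   # illustrative transcription only
}

# The full training set: every speaker utters every vocabulary word,
# though not necessarily in the same order.
training_data: List[Utterance] = []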

      The steps for the algorithm are as follows:
      Step 1.  Perform fenemic supervised labelling on the training data as described in (1), but using speaker-dependent means and covariances for each iteration of re-labelling (see the illustrative sketch after these steps).
      Step 2.  Construct fenemic word models for each word in the training data using the labels from Step 1 and the techniques of (2,3).  These fenemic baseforms are required only for the purpose of obtaining fenemic phone statistics at the next step.
      Step 3.  Using the labelled training data, partition each phone in t...
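
      The extract ends part-way through Step 3.  As a rough, hypothetical illustration of the speaker-dependent re-labelling pass named in Step 1 (it is not the supervised labelling procedure of reference (1), and it reuses the hypothetical Utterance layout sketched earlier), one iteration might estimate per-speaker, per-feneme statistics from the current labels and then reassign each frame:

import numpy as np
from collections import defaultdict

def relabel_iteration(utterances, labels, num_fenemes):
    """One re-labelling pass: estimate speaker-dependent means and
    (diagonal) variances for each feneme label from the current frame
    labels, then reassign every frame to the best-scoring feneme under
    its own speaker's statistics."""
    # Pool frames by (speaker, feneme) under the current labelling.
    frames_by = defaultdict(list)
    for utt, utt_labels in zip(utterances, labels):
        for frame, lab in zip(utt.frames, utt_labels):
            frames_by[(utt.speaker_id, lab)].append(frame)

    # Speaker-dependent means and diagonal covariances per feneme.
    stats = {}
    for key, frames in frames_by.items():
        x = np.asarray(frames)
        stats[key] = (x.mean(axis=0), x.var(axis=0) + 1e-6)

    # Re-label every frame using its own speaker's statistics.
    new_labels = []
    for utt in utterances:
        relabelled = []
        for frame in utt.frames:
            best_label, best_score = None, -np.inf
            for f in range(num_fenemes):
                if (utt.speaker_id, f) not in stats:
                    continue
                mean, var = stats[(utt.speaker_id, f)]
                # Diagonal-Gaussian log-likelihood, constants dropped.
                score = -0.5 * float(np.sum((frame - mean) ** 2 / var + np.log(var)))
                if score > best_score:
                    best_label, best_score = f, score
            relabelled.append(best_label)
        new_labels.append(relabelled)
    return new_labels

      Repeating such passes until the labels stabilize would yield the speaker-dependent labels consumed in Step 2.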