
Fast, Approximate Fenemic Baseform Generation From a Single Utterance

IP.com Disclosure Number: IPCOM000102655D
Original Publication Date: 1990-Dec-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 1 page(s) / 45K

Publishing Venue

IBM

Related People

Davies, K: AUTHOR [+3]

Abstract

Disclosed is a computationally simple method of generating an approximate fenemic baseform given a single utterance of a word, to dynamically add new words to the active vocabulary of a speech recognition system (1). The procedure takes a string of acoustic labels A = (a1, a2, ..., aN) as input and produces a string of fenemic phones F = (f1, f2, ..., fM) as output.



      Each label in the input sequence is mapped to a fenemic phone
through a one-dimensional translation table T: fi = T(ai), where T
has K entries, and K is the size of the input label alphabet (2,3).
This mapping is independent of acoustic and fenemic context. Because
each label yields exactly one phone, the length of the baseform, M,
is equal to the length of the input sequence, N.
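
      The label-to-phone mapping can be sketched as follows. This is a
minimal illustration, not the disclosed implementation: the function
name labels_to_baseform, the integer encoding of labels and phones,
and the toy table values are all assumptions made for the example.

```python
from typing import List, Sequence


def labels_to_baseform(labels: Sequence[int], table: Sequence[int]) -> List[int]:
    """Apply f_i = T(a_i) to every label, independently of context.

    `labels` holds acoustic label indices in the range 0..K-1 and
    `table` holds one fenemic phone index per label (K entries).
    Both encodings are hypothetical.
    """
    return [table[a] for a in labels]


# Toy translation table for a label alphabet of size K = 4 (made-up values).
T = [2, 0, 3, 0]

# Acoustic labels from a single utterance (made-up values), N = 5.
A = [1, 1, 3, 2, 0]

F = labels_to_baseform(A, T)
print(F)  # [0, 0, 0, 3, 2]

# One phone per label, so the baseform length M equals N.
assert len(F) == len(A)
```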

      The translation table T can be generated by assigning to each
acoustic label the fenemic phone that is most likely, a priori, to
produce the given label.  This table can easily be derived from the
label output probabilities used in Markov-model-based speech
recognizers.  It can be derived for each speaker by using
speaker-dependent statistics, or generically by using
speaker-independent statistics.
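
      One possible way to derive such a table is sketched below. The
text does not give the exact selection rule, so this example assumes
that each label is assigned the phone maximizing P(label | phone),
optionally weighted by a phone prior. The function name
build_translation_table, the layout of output_probs, and the
probability values are hypothetical.

```python
from typing import List, Optional, Sequence


def build_translation_table(
    output_probs: Sequence[Sequence[float]],
    phone_priors: Optional[Sequence[float]] = None,
) -> List[int]:
    """Return T where T[a] is the best-scoring phone for label a.

    output_probs[f][a] is assumed to hold P(label a | phone f).
    If phone_priors is given, the score is prior[f] * P(a | f);
    otherwise all phones are treated as equally likely.
    """
    num_phones = len(output_probs)
    num_labels = len(output_probs[0])
    if phone_priors is None:
        phone_priors = [1.0] * num_phones

    table = []
    for a in range(num_labels):
        # Pick the phone with the highest score for this label.
        best_phone = max(
            range(num_phones),
            key=lambda f: phone_priors[f] * output_probs[f][a],
        )
        table.append(best_phone)
    return table


# Made-up output probabilities for 3 phones and 3 labels (rows sum to 1).
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.6, 0.3],
    [0.2, 0.2, 0.6],
]
print(build_translation_table(probs))  # [0, 1, 2]
```

Passing speaker-dependent or speaker-independent statistics as
output_probs would give a per-speaker or a generic table respectively.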

      The translation table can be precomputed, requires only a
single entry per possible acoustic label value, and produces a
fenemic baseform in linear time.
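
      The sketch below illustrates how a precomputed table might be
used to add words to an active vocabulary at run time. The class
ActiveVocabulary, its method names, and the sample data are
hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List, Sequence


class ActiveVocabulary:
    """Holds a precomputed translation table and the baseforms built with it."""

    def __init__(self, table: Sequence[int]) -> None:
        # The table is computed once, ahead of time: one entry per label.
        self.table = list(table)
        self.baseforms: Dict[str, List[int]] = {}

    def add_word(self, word: str, labels: Sequence[int]) -> List[int]:
        # One table lookup per label, so the cost grows linearly with N.
        baseform = [self.table[a] for a in labels]
        self.baseforms[word] = baseform
        return baseform


vocab = ActiveVocabulary([2, 0, 3, 0])  # made-up table, K = 4
vocab.add_word("hello", [3, 3, 1, 0])   # made-up label string
print(vocab.baseforms["hello"])         # [0, 0, 0, 2]
```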

      References
(1) L. R. Bahl, P. F. Brown, P. V. de Souza, R. L. Mercer and
M. A. Picheny, "A method for the construction of...