Method for Representing Baseforms as Graphs for Use in a Speech Recognition System

IP.com Disclosure Number: IPCOM000113171D
Original Publication Date: 1994-Jul-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 83K

Publishing Venue

IBM

Related People

Cohen, PS: AUTHOR [+2]

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      A method for applying context-sensitive phonological rules to a
phonological baseform string so as to produce a directed graph
representing the set of possible phonological surface forms for the
string (i.e., its possible pronunciations) is described in [1].
Most words (at least in English) are adequately represented by a
single phonological baseform string.  However, many words require
more than one.  For example, the vowel sound in "the" can be that
of "each" or that of the final vowel of "sofa"; the first vowel of
"either" can be that of "each" or that of "ice"; and the final
consonant sound of "blouse" can be that of "bus" or "buzz".
Previously this issue has been addressed either by writing separate
baseforms or by writing ad hoc rules for each case as it arises.
Either method can serve as an approximation, but both suffer from
the following flaws: a) they do not accurately model the linguistic
facts; b) they prevent accurate statistics from being gathered; and
c) they are prone to error.

      As noted above, [1] describes a method for applying a set of
phonological rules to a phonological baseform represented as a
phonemic graph.  In practice, however, because of the difficulty of
representing baseforms as phonemic graphs, the rule applier was
implemented to accept only strings (i.e., degenerate graphs) as
baseforms [2,3].  Thus, for example, one must provide two separate
baseforms for each of "either" and "neither", even though, for a
single talker, the choice of the first vowel in these two words is
highly correlated.  By using unrelated representations for these
different pronunciations, we lose the possibility of modeling this
correlation.
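
      To make the contrast between a string baseform and a graph
baseform concrete, the following is a minimal Python sketch, not the
implementation described in [1-3].  It assumes an acyclic graph
stored as (source, destination, phoneme) arcs; the function name
pronunciations, the node numbering, and the ARPAbet-like symbols IY,
AY, DH and ER are all illustrative assumptions.  A single graph for
"either" carries both first vowels as parallel arcs, where the
string-only approach would need two unrelated baseforms.

```python
from collections import defaultdict


def pronunciations(arcs, start, final):
    """Enumerate every phoneme sequence on a path from start to final.

    Assumes the graph is acyclic (hypothetical helper, not from the source).
    """
    out_arcs = defaultdict(list)
    for src, dst, phone in arcs:
        out_arcs[src].append((dst, phone))

    results = []

    def walk(node, prefix):
        if node == final:
            results.append(tuple(prefix))
            return
        for dst, phone in out_arcs[node]:
            walk(dst, prefix + [phone])

    walk(start, [])
    return results


# Hypothetical graph baseform for "either": the two candidate first
# vowels are parallel arcs between nodes 0 and 1.
EITHER = [(0, 1, "IY"), (0, 1, "AY"), (1, 2, "DH"), (2, 3, "ER")]

either_prons = pronunciations(EITHER, 0, 3)
for pron in either_prons:
    print(" ".join(pron))
```

      Because both alternatives live in one structure, statistics
about which arc a talker takes could in principle be attached to the
shared alternation rather than to two separate entries.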

      As described in [2,3], the graph resulting from the application
of phonological rules to a phonemic string can be represented as a
sequence of c-links [3].  Each c-link is an irreducible graph
containing a single initial and a single final node.  In the usual
application of the phonological-rule applier described in [1], these
c-links arise naturally as the result of rule application.  This
invention uses...
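
      The c-link description above can be illustrated with a small
Python sketch.  It is only a guess at a convenient data structure,
not the representation used in [2,3]: the class name CLink, its
fields, and the phoneme symbols are assumptions, each c-link is
assumed to be acyclic, and irreducibility is not checked.  The
sketch shows how a sequence of c-links, each with a single initial
and a single final node, determines a set of surface forms.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class CLink:
    """A small graph with one initial and one final node (hypothetical type)."""
    initial: int
    final: int
    arcs: tuple  # (src, dst, phone) triples; assumed acyclic

    def paths(self):
        """Return every phone sequence from the initial to the final node."""
        found = []
        stack = [(self.initial, ())]
        while stack:
            node, phones = stack.pop()
            if node == self.final:
                found.append(phones)
                continue
            for src, dst, phone in self.arcs:
                if src == node:
                    stack.append((dst, phones + (phone,)))
        return sorted(found)


def surface_forms(clinks):
    """Concatenate one path from each c-link, taken in sequence order."""
    per_link = (c.paths() for c in clinks)
    return [sum(choice, ()) for choice in product(*per_link)]


# Illustrative sequence for "blouse": the last c-link offers two
# alternative final consonants as parallel arcs.
BLOUSE = [
    CLink(0, 1, ((0, 1, "B"),)),
    CLink(0, 1, ((0, 1, "L"),)),
    CLink(0, 1, ((0, 1, "AW"),)),
    CLink(0, 1, ((0, 1, "S"), (0, 1, "Z"))),
]

blouse_forms = surface_forms(BLOUSE)
```

      Since every c-link has exactly one entry and one exit, the
surface forms of the whole sequence are simply the combinations of
one path per c-link, which is what the product in surface_forms
computes.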