Selecting Splits via Mixed Multinomial Modeling for Phonological Decision Trees

IP.com Disclosure Number: IPCOM000108378D
Original Publication Date: 1992-May-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 3 page(s) / 114K

Publishing Venue

IBM

Related People

Bahl, LR: AUTHOR [+6]

Abstract

Described is a criterion for selecting splits for constructing a phonological decision tree. Models for words in continuous speech are derived by constructing decision trees that encode the rules for pronunciation of each word in different contexts (*). Such trees are constructed by using data obtained from several thousand utterances of each phone in different contexts. We start with all the data at the root node of the tree. The decision tree construction procedure is given a set of possible questions that can be asked at each node to split it into two nodes, one having all the samples that answer yes to the question and the other having the rest. The iterative procedure splits the nodes by trying all questions and selecting the one that maximizes some evaluation function.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

       Described is a criterion for selecting splits for
constructing a phonological decision tree.  Models for words in
continuous speech are derived by constructing decision trees that
encode the rules for pronunciation of each word in different contexts
(*).  Such trees are constructed by using data obtained from several
thousand utterances of each phone in different contexts.  We start
with all the data at the root node of the tree.  The decision tree
construction procedure is given a set of possible questions that can
be asked at each node to split it into two nodes, one having all the
samples that answer yes to the question and the other having the
rest.  The iterative procedure splits the nodes by trying all
questions and selecting the one that maximizes some evaluation
function.  The method adopted in (*) is to fit a model to the data at
the two offspring nodes and compute the joint probability of observing
the strings at these nodes, given these models.  The split that
maximizes this probability is selected as the best one.
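
      The split-selection loop described above can be sketched as
follows.  This is a minimal illustration, not the procedure of (*):
the function names, the representation of a question as a predicate
on a context, and the choice of a memoryless label model fitted by
relative frequencies as the per-node model are all assumptions made
for the example.

```python
import math
from collections import Counter


def node_log_likelihood(samples):
    """Log-probability of the label strings at one node.

    Assumption for this sketch: labels are drawn independently from a
    single distribution estimated by relative frequencies at the node.
    """
    counts = Counter(label for s in samples for label in s)
    total = sum(counts.values())
    return sum(c * math.log(c / total) for c in counts.values())


def best_split(samples, contexts, questions):
    """Try every question and return (name, score) of the best split.

    samples   : list of label strings (lists of label ids)
    contexts  : list of context descriptions, one per sample
    questions : dict mapping a question name to a yes/no predicate
                on a context (hypothetical representation)
    """
    best = None
    for name, ask in questions.items():
        yes = [s for s, c in zip(samples, contexts) if ask(c)]
        no = [s for s, c in zip(samples, contexts) if not ask(c)]
        if not yes or not no:
            continue  # the question does not actually split the node
        # Joint log-probability of the data at the two offspring nodes.
        score = node_log_likelihood(yes) + node_log_likelihood(no)
        if best is None or score > best[1]:
            best = (name, score)
    return best


samples = [[1, 1], [1, 1, 1], [2, 2], [2]]
contexts = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
questions = {
    "left_is_a": lambda c: c[0] == "a",
    "right_is_x": lambda c: c[1] == "x",
}
chosen = best_split(samples, contexts, questions)
```

      In this toy data the left context fully determines the label,
so the question on the left context separates the samples into two
pure nodes and is the one selected.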

      This invention describes a natural model that can be fitted
to the data at each node.  It is shown that, under one assumption,
the evaluation function derived using these models is the same as the
one used in (*), which was arrived at under different assumptions.

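      Let F be the number of possible acoustic labels.  Each sample i
at a node is some string of labels $y^i_1, \ldots, y^i_{N_i}$ of
length $N_i$.  Assume that in this string $n_{ij}$ is the number of
times label j occurs.  That is, $n_i = (n_{i1}, \ldots, n_{iF})$ is
the histogram of counts for the sample i.

      If we assume that the source for the sample strings is
memoryless, then conditionally on the string length $N_i$, a vector of
counts (histogram) $n_i$ has the multinomial distribution

$$ P(n_i \mid N_i) = \frac{N_i!}{\prod_{j=1}^{F} n_{ij}!} \prod_{j=1}^{F} p_j^{n_{ij}} $$

where $p_j$ is the probability of label j.

      Denote by $q_N$ the probability that the string length is N.
Then the joint likelihood of the sample of histogram vectors
$n_1, \ldots, n_M$ and sample sizes $N_1, \ldots, N_M$ is

$$ L(p, q) = \prod_{i=1}^{M} q_{N_i} \, \frac{N_i!}{\prod_{j=1}^{F} n_{ij}!} \prod_{j=1}^{F} p_j^{n_{ij}} $$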
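
      As an illustration, the logarithm of this joint likelihood can
be evaluated as in the sketch below.  The representation is an
assumption made for the example: each sample is given as a histogram
of label counts, p is a list of label probabilities, and q is a
mapping from string length to its probability.  The function names
are hypothetical.

```python
import math


def log_multinomial(hist, p):
    """Log of the multinomial probability of one histogram of counts,
    conditional on the string length sum(hist)."""
    length = sum(hist)
    out = math.lgamma(length + 1)  # log(length!)
    for count, prob in zip(hist, p):
        out -= math.lgamma(count + 1)  # subtract log(count!)
        if count:
            out += count * math.log(prob)
    return out


def joint_log_likelihood(histograms, p, q):
    """Sum over samples of log q[length] plus the multinomial term."""
    total = 0.0
    for hist in histograms:
        length = sum(hist)
        total += math.log(q[length]) + log_multinomial(hist, p)
    return total


# Two samples of length 2 over F = 2 labels.
histograms = [[1, 1], [2, 0]]
p = [0.5, 0.5]
q = {2: 1.0}
value = joint_log_likelihood(histograms, p, q)
```

      Working in the log domain avoids underflow when a node holds
many samples.  A label with zero probability but a nonzero count
raises an error here rather than returning minus infinity.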

      Enforcing the constraint that each of p and q has nonnegative
components and that each vector sums to one, it is not difficult to
show that this likelihood is maximized if w...