
Method for the Removal of Outliers from Data Used in the Construction of Phonological Rules

IP.com Disclosure Number: IPCOM000108122D
Original Publication Date: 1992-Apr-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 2 page(s) / 85K

Publishing Venue

IBM

Related People

R Bahl, L: AUTHOR [+5]

Abstract

This invention presents a method for removing outlying strings from the data used in the automatic construction of phonological rules for continuous speech modeling. To construct accurate models for continuous speech that take into account the contextual variations commonly found in speech, a set of phonological rules is derived automatically from recorded data by constructing binary decision trees (1). Utterances of thousands of sentences by several speakers are aligned, using the Viterbi algorithm (2), against the phonetic models corresponding to the sentences spoken. The acoustic label strings that correspond to the utterance of each phone in the vocabulary are then identified and annotated with their context information.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.


       This invention presents a method for removing outlying
strings from the data used in the automatic construction of
phonological rules for continuous speech modeling.  To construct
accurate models for continuous speech that take into account the
contextual variations commonly found in speech, a set of phonological
rules is derived automatically from recorded data by constructing
binary decision trees (1).  Utterances of thousands of sentences by
several speakers are aligned, using the Viterbi algorithm (2),
against the phonetic models corresponding to the sentences spoken.
The acoustic label strings that correspond to the utterance of each
phone in the vocabulary are then identified and annotated with their
context information.  This database of strings and their associated
contexts is used to construct the binary decision trees.  These trees
identify the different pronunciations of a given phone in different
contexts.
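
As an illustration only, the database of label strings and contexts described above might be organized as in the following sketch. The record layout, the field names (`phone`, `labels`, `left_context`, `right_context`), the phone symbols, and the integer label values are all assumptions made for this example; the disclosure does not specify a storage format.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class AnnotatedString:
    """One aligned utterance of a phone (hypothetical record layout)."""
    phone: str                      # phone whose utterance produced the labels
    labels: Tuple[int, ...]         # acoustic label string from the alignment
    left_context: Tuple[str, ...]   # preceding phones (assumed representation)
    right_context: Tuple[str, ...]  # following phones (assumed representation)


def group_by_phone(records: Iterable[AnnotatedString]) -> Dict[str, List[AnnotatedString]]:
    """Collect the annotated strings of each phone, one list per phone."""
    by_phone: Dict[str, List[AnnotatedString]] = defaultdict(list)
    for record in records:
        by_phone[record.phone].append(record)
    return dict(by_phone)


# Made-up records; label values and phone symbols are placeholders.
records = [
    AnnotatedString("AE", (12, 12, 40, 41), ("K",), ("T",)),
    AnnotatedString("AE", (12, 40, 40, 41), ("B",), ("T",)),
    AnnotatedString("T", (7, 7, 93), ("AE",), ("#",)),
]
database = group_by_phone(records)
```

Grouping by phone reflects the statement that the trees identify the pronunciations of a given phone, so each phone's strings would be handled together.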

      Because the Viterbi algorithm may misalign some strings, the
data used in growing the trees will be corrupted by erroneous label
strings.  These will typically be very different from the strings
genuinely produced by utterances of the given phone, so the procedure
that learns phonological rules from this data is apt to be misled: it
might treat the erroneous strings as significant phonological events
and thus miss genuine events that we would like it to learn.  Hence,
we would like an effective procedure that removes such outlying label
strings from the data used to grow the decision trees.  This
invention describes just such a procedure.  The procedure also
attempts to ensure that strings representing rare realizations of
phones resulting from genuine contextual effects are not discarded.
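
The disclosed procedure itself is not reproduced in this abbreviated text. Purely to illustrate the problem it addresses, the sketch below scores each label string of one phone by its mean edit distance to the other strings of that phone and flags the ones that stand far apart. The distance measure, the median-based threshold, the `factor` parameter, and the label values are all assumptions chosen for this example and are not taken from the disclosure.

```python
from typing import List, Sequence


def edit_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Levenshtein distance between two label strings."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def mean_distance_to_others(strings: Sequence[Sequence[int]]) -> List[float]:
    """Mean edit distance from each string to every other string."""
    n = len(strings)
    if n < 2:
        return [0.0] * n
    scores = []
    for i, s in enumerate(strings):
        total = sum(edit_distance(s, t) for j, t in enumerate(strings) if j != i)
        scores.append(total / (n - 1))
    return scores


def flag_candidates(strings: Sequence[Sequence[int]], factor: float = 1.5) -> List[int]:
    """Indices whose score exceeds `factor` times the median score (assumed rule)."""
    scores = mean_distance_to_others(strings)
    if not scores:
        return []
    median = sorted(scores)[len(scores) // 2]
    return [i for i, score in enumerate(scores) if score > factor * median]


# Made-up label strings for one phone; the last one mimics a misalignment.
strings = [(12, 12, 40, 41), (12, 40, 40, 41), (12, 12, 40, 40), (7, 7, 93, 93, 5)]
candidates = flag_candidates(strings)
```

A filter this simple ignores context entirely, so it could also flag a rare but genuine realization of a phone. Guarding against that is exactly the additional requirement the text above places on the disclosed procedure.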

    ...