Building Baseforms for a New Application Domain

IP.com Disclosure Number: IPCOM000104323D
Original Publication Date: 1993-Apr-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 74K

Publishing Venue

IBM

Related People

Davies, K: AUTHOR [+3]

Abstract

To port the Tangora Automatic Speech Recognizer to a new application domain, one must create baseforms for the domain-specific words. Since different speakers may pronounce a word in different ways, a method is needed to determine how many baseforms each word requires and what those baseforms should be.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

     When moving the Tangora Automatic Speech Recognizer (ASR) from
an office correspondence domain to a different domain (e.g., medical
or insurance), one must create baseforms for the new domain.  In
general, it is not sufficient to build baseforms for just the words
that are new to the domain, because a word in the new domain may be
pronounced differently from a word with the identical spelling in the
office correspondence domain.  Building baseforms using utterances
from a single speaker is not a problem [1].  Because users vary in
the way they speak the domain-specific words (as opposed to the most
general English words), it is necessary to make the baseforms from a
sampling of typical users.  Thus, one must use a multiple utterance
algorithm to select the baseforms from the individual baseforms
created by each speaker.  Two problems remain, however: determining
the statistics to be used for running the multiple utterance
algorithm, and constraining the lists from each speaker to only
baseforms that score well.
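
     As a rough illustration of the selection step, the sketch below
pools the candidate baseforms proposed by the individual speakers and
keeps the one with the highest total score over all speakers.  The
function name select_baseform, the score table, the phone strings,
and the sum-of-log-scores criterion are all assumptions made for
illustration; the text above does not spell out the multiple
utterance algorithm itself.

```python
from typing import Dict


def select_baseform(scores: Dict[str, Dict[str, float]]) -> str:
    """Pick the candidate with the highest summed log-score over all speakers.

    scores[candidate][speaker] is a hypothetical log-likelihood of that
    speaker's utterance given the candidate baseform (an assumption; the
    real recognizer's scoring is not described in the source text).
    """
    if not scores:
        raise ValueError("no candidate baseforms supplied")
    totals = {cand: sum(per_speaker.values())
              for cand, per_speaker in scores.items()}
    # Iterate candidates in sorted order so ties resolve deterministically
    # to the lexicographically smallest candidate string.
    return max(sorted(totals), key=lambda cand: totals[cand])


# Illustrative data only: two candidate pronunciations, three speakers.
example_scores = {
    "T AH M EY T OW": {"spk1": -10.0, "spk2": -14.0, "spk3": -11.0},
    "T AH M AA T OW": {"spk1": -13.0, "spk2": -9.0, "spk3": -15.0},
}
best = select_baseform(example_scores)
```

     Summing log-scores treats the speakers' utterances as
independent evidence, which is a simplifying assumption of this
sketch rather than a property stated in the disclosure.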

     Two novelties are disclosed.  First, a set of smoothed
statistics is obtained from only the speakers used to generate the
baseforms.  Second, after the baseform generation program is run,
each list of candidate baseforms is truncated to its top n
baseforms.
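
     The second novelty, truncating each candidate list to its top n
entries, can be sketched as follows.  This is a minimal sketch under
stated assumptions: the function name truncate_candidates, the
(baseform, score) pair layout, and the convention that a higher score
is better are hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List, Tuple

# (baseform, score) pair; a higher score is assumed to be better.
Candidate = Tuple[str, float]


def truncate_candidates(
    per_speaker: Dict[str, List[Candidate]], n: int
) -> Dict[str, List[Candidate]]:
    """Keep only the n best-scoring candidate baseforms for each speaker."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return {
        speaker: sorted(cands, key=lambda c: c[1], reverse=True)[:n]
        for speaker, cands in per_speaker.items()
    }


# Illustrative data only: candidate lists produced for two speakers.
candidate_lists = {
    "spk1": [("B1", -12.0), ("B2", -9.5), ("B3", -20.0)],
    "spk2": [("B2", -11.0), ("B4", -10.0)],
}
top_two = truncate_candidates(candidate_lists, n=2)
```

     With n = 2, the lowest-scoring candidate for spk1 is dropped and
the list for spk2, which already holds only two entries, is kept in
full but reordered by score.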

     In order to build baseforms for an Automatic Speech Recognizer
(ASR), the following steps are performed:

1.  The vocabulary of the new domain is determined.
2.  N speakers read a training script and training is performed for
   ...