Browse Prior Art Database

Speeding the Best-First Search for Determining Phonetic Baseforms

IP.com Disclosure Number: IPCOM000122297D
Original Publication Date: 1991-Nov-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 2 page(s) / 89K

Publishing Venue

IBM

Related People

Epstein, ME: AUTHOR

Abstract

In the Tangora (1,2,3), a best-first search algorithm is used to generate a phonetic baseform for a word given its spelling and one utterance. This algorithm was designed for accuracy and takes over 2.5 seconds per word on a RISC System/6000* (RS/6000). Disclosed are improvements to the search procedure that reduce this to 1.8 seconds.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Speeding the Best-First Search for Determining Phonetic Baseforms

      In the Tangora (1,2,3), a best-first search algorithm is
used to generate a phonetic baseform for a word given its spelling
and one utterance.  This algorithm was designed for accuracy and
takes over 2.5 seconds per word on a RISC System/6000* (RS/6000).
Disclosed are improvements to the search procedure that reduce this
to 1.8 seconds.

      The algorithm for determining phonetic baseforms (1) used in
the Tangora (2) requires a best-first search (3,4). This is used
because it quickly finds a solution that is near the optimal.  Two
discoveries were made that suggested two enhancements to speed the
search so that it runs in under 2 seconds.
1.   The space-optimizing best-first search saves significant space
and time over using a pure best-first search by using a backtracking
algorithm.  But it was discovered that 13 percent of the nodes were
being re-expanded, some many times.
2.   The search algorithm continues to search for the best answer,
even after ones are found, in the hopes of finding a better one.  It
was discovered that rarely is a better answer found.

      Thus, the search was speeded up by 30% by implementing the
following improvements:
1.   Add nodes to the search tree that are within delta1 of the best
node in the priority queue of nodes to be expanded.  Later, nodes
within delta2 < delta1 of the best node will be expanded.  The
original algorithm assumed delta1 = delta2.  One could also use
delta3 > delta2 to add nodes to the search tree each subsequent time
a node is re-expanded, but for the present application this did not
prove worthwhile.
2.   The search is allowed to run to completion.  If more time is
available, more paths are explored up until either the time limit is
reached, or no more nodes within delta2 of the best answer are
available.  If no more time is available, a specified number of
additional iterations are performed to allow any other parts close to
completion to have a chance to complete.

      The original best-first search algorithm was run on 80
utterances of 20 words on an RS/6000.  The original algorithm used
delta1 = delta2 = 2.5.  The 80 utterances used 30,767 tree nodes,
4072 of them being re-expanded.  A delta1 = 3.5 when nodes are
initially expanded resulted in 36,283 tree nod...