Delimiter Bitmaps for the KANA Address Post-Processing

IP.com Disclosure Number: IPCOM000113202D
Original Publication Date: 1994-Jul-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 55K

Publishing Venue

IBM

Related People

Kita, Y: AUTHOR

Abstract

Disclosed is a program that recognizes address expressions written in Katakana characters even when they differ slightly from the entries in the address dictionary, by using bitmap flags that indicate the changeable features of each address candidate. When the post-processor retrieves one candidate entry from the dictionary, it dynamically creates the possible address expressions from the word and its bitmap, and evaluates the character candidate array (the output of the character recognition program) against these candidates, more than two times in total.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 68% of the total text.

      The delimiter bitmap marks the part of an address expression
that causes a difference in word length when the expression is
mistaken for another.  Each bit of this flag corresponds to one kind
of suffix or prefix, such as "Mura" or "Oh-aza".  Fig. 1 shows an
example.  It contains three addresses, which mean "Aioi-machi",
"Asahi-choh" and "Asada-mura".  That is, the bitmaps 0x0020, 0x0010
and 0x0008 correspond to the suffixes "Machi", "Choh" and "Mura"
respectively.  Here, assume the user writes "Asahi-machi" to mean
"Asahi-choh".  To recognize the former as the latter, the
post-processor dynamically makes three candidates from the term
"Asahi" and the flag 0x0010, supposing the following three cases from
the bitmap: (1) The user may write the proper suffix, "Choh",
(2) The user may mistake the suffix for another one, "Machi...
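
      The candidate generation described above might be sketched as
follows.  This is only an illustration under stated assumptions, not
the disclosed program: the bit values and suffixes come from the
Fig. 1 example, while the names SUFFIX_BITS, DICTIONARY, CONFUSABLE,
make_candidates and resolve, the romanized strings, and the layout of
the substitution table are all hypothetical.  Only cases (1) and (2)
are modeled, because the third case is cut off in this abbreviated
text.  Exact string comparison stands in for the evaluation against
the character candidate array, which the text does not detail.

```python
# Bit values and suffixes as given in the Fig. 1 example.
SUFFIX_BITS = {0x0020: "machi", 0x0010: "choh", 0x0008: "mura"}

# Hypothetical dictionary: stem -> delimiter bitmap of its proper suffix.
DICTIONARY = {"Aioi": 0x0020, "Asahi": 0x0010, "Asada": 0x0008}

# Assumption: which suffix a writer may substitute for the proper one.
# Only "choh" written as "machi" is taken from the text; the table
# format itself is invented for this sketch.
CONFUSABLE = {0x0010: 0x0020}


def make_candidates(stem, bitmap):
    """Return the address expressions to try for one dictionary entry."""
    # Case (1): the user writes the proper suffix.
    candidates = [f"{stem}-{SUFFIX_BITS[bitmap]}"]
    # Case (2): the user writes another suffix in place of the proper one.
    other = CONFUSABLE.get(bitmap)
    if other is not None:
        candidates.append(f"{stem}-{SUFFIX_BITS[other]}")
    return candidates


def resolve(written):
    """Map a written expression to its dictionary address, or None."""
    for stem, bitmap in DICTIONARY.items():
        if written in make_candidates(stem, bitmap):
            return f"{stem}-{SUFFIX_BITS[bitmap]}"
    return None


if __name__ == "__main__":
    print(make_candidates("Asahi", 0x0010))
    print(resolve("Asahi-machi"))
```

      In this sketch the dictionary stores each stem only once, and
the variants are generated at lookup time from the bitmap, so
"Asahi-machi" resolves to the entry "Asahi-choh" without a separate
dictionary entry for the variant spelling.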