Hash Function for the Double Byte Code String Character

IP.com Disclosure Number: IPCOM000113726D
Original Publication Date: 1994-Sep-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 36K

Publishing Venue

IBM

Related People

Kita, Y: AUTHOR

Abstract

Disclosed is a hash method for the Double Byte Code String (DBCS) character. The hash function treats a DBCS character as a 4-element array of 4-bit codes and returns the hash value of that array as its output. Using this hash function and a hash index, an application program can quickly search for DBCS words in a dictionary.


      The Figure shows an example of calculating the hash value of a
DBCS character.  The hash function takes the DBCS character (1),
viewed as a 2-element array of 8-bit codes (2), and treats it as a
4-element array of 4-bit codes (3).  The hash value of this array is
then calculated in the ordinary way (4).  Here, p and M are fixed
positive prime numbers, and the hash value is distributed between 0
and M-1.

      This method can be applied to a DBCS word-searching program.
Assume that, because of an application requirement, the dictionary
must not be sorted by word length.  In other words, the words in the
dictionary can be sorted only by their initial letter, as in an
encyclopedia.  However, the variety of initial letters is vast, so
some mechanism is needed to select the initial letter quickly.  By
sorting the words primarily by the hash value of their initial
letter, the application reduces the number of distinct sort keys to
at most M and can find the target word quickly.
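
      The scheme described above might be sketched as follows.  This
is an illustration under stated assumptions, not the disclosed
implementation: the text does not spell out the "ordinary way" of
hashing an array, so a polynomial hash evaluated by Horner's rule is
assumed here, and the values chosen for p and M, the function names,
and the representation of words as byte strings made up only of
2-byte characters are all hypothetical.

```python
from collections import defaultdict

# Assumed values: the disclosure only says p and M are fixed primes.
P = 31
M = 251


def nibbles(code: int) -> list[int]:
    """Split a 16-bit character code into four 4-bit values, high first."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("expected a 16-bit code")
    return [(code >> shift) & 0xF for shift in (12, 8, 4, 0)]


def dbcs_hash(code: int, p: int = P, m: int = M) -> int:
    """Assumed polynomial hash of the nibble array; result is in 0..m-1."""
    h = 0
    for n in nibbles(code):
        h = (h * p + n) % m
    return h


def first_char_hash(word: bytes) -> int:
    """Hash the first 2-byte character of a word (big-endian byte order)."""
    return dbcs_hash(int.from_bytes(word[:2], "big"))


def build_index(words: list[bytes]) -> dict[int, list[bytes]]:
    """Group dictionary words by the hash of their initial character."""
    index = defaultdict(list)
    for word in words:
        index[first_char_hash(word)].append(word)
    return dict(index)


def lookup(index: dict[int, list[bytes]], word: bytes) -> bool:
    """Search only the group selected by the initial character's hash."""
    return word in index.get(first_char_hash(word), [])
```

      With this layout a search touches only the words whose initial
character hashes to the same value, so at most M groups need to be
told apart no matter how many distinct initial characters the
dictionary contains.  For example, the code 0x88EA is split into the
4-bit values 8, 8, 14 and 10 before hashing.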

      This hash method can also be applied to characters in code sets
of 3 or more bytes.
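
      One way the extension to longer characters could look is
sketched below.  As before, this is only an assumption-laden
illustration: the polynomial hash, the values of p and M, and the
function name are hypothetical, and the character is simply passed
as a byte string of any length so that each byte contributes two
4-bit values.

```python
def multibyte_hash(char: bytes, p: int = 31, m: int = 251) -> int:
    """Hash a character of any byte length as an array of 4-bit values.

    p and m are assumed primes; the result lies in 0..m-1.
    """
    h = 0
    for byte in char:
        # High nibble first, then low nibble, for every byte in turn.
        for nibble in (byte >> 4, byte & 0xF):
            h = (h * p + nibble) % m
    return h
```

      A 2-byte character yields a 4-element array and a 3-byte
character a 6-element array; the rest of the calculation is
unchanged.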