Browse Prior Art Database

Keyed File Hashing With Bit Rotation

IP.com Disclosure Number: IPCOM000101748D
Original Publication Date: 1990-Aug-01
Included in the Prior Art Database: 2005-Mar-16
Document File: 4 page(s) / 111K

Publishing Venue

IBM

Related People

James, W: AUTHOR [+3]

Abstract

This article describes a hashing technique called bit rotation. Bit rotation improves distribution of keys that contain repetitive patterns.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Keyed File Hashing With Bit Rotation

       This article describes a hashing technique called bit
rotation.  Bit rotation improves distribution of keys that contain
repetitive patterns.

      Keyed disk files are used when quick retrieval of information
by key is required.  The access of keyed records by hashing the key
to a block number is the quickest method available.  This method
allows access to a large percentage of the records with one disk
read.

      Keyed file efficiency depends on the number of chained records
in the file.  Chained records result when enough keys hash to a block
to cause the block to overflow.  More than one disk read is required
to access a chained record. The percentage of chained records is a
measure of the efficiency and usefulness of a keyed file.

      All keyed files would be very efficient if all keys were random
in nature.  Unfortunately, most "real world" keys contain repetitive
patterns which cause poor record distribution and large percentages
of chained records.

      The hashing algorithm transforms the key field in a record to a
block number in a keyed file.  The simplest type of hashing would be
to divide a binary key by a number called a randomizing divisor to
obtain a remainder.  The remainder becomes the home block number for
the binary key. For this to work correctly, the randomizing divisor
must not be larger than the total number of home blocks in the keyed
file.

      Selection of an efficient randomizing divisor is necessary to
obtaining even distribution of records to home blocks.  Usually prime
numbers make the best randomizing divisors.  An accepted way of
selecting a randomizing divisor is to select the largest odd number
that is less than the number of blocks in the file, and which is not
evenly divisible by any single-digit number.

      Since few files have binary keys, a transformation operation is
usually required to produce a binary number for division by the
randomizing divisor.  The transformation process used for this
article involves use of "Exclusive OR" operations to fold the bits of
the key into a 32-bit number. Each byte of the key is folded into a
byte of the 32-bit number by the "Exclusive OR" instruction.  The
order of processing is shown in the figure.

      After folding, the high-order bit of the folded key is zeroed
to prevent a negative result.  The folded key is then divided by the
randomizing divisor to get the home block number.

      This method of transforming a key to a block number is called
the XOR hashing algorithm in this article.

      These are the steps that are added to the XOR hashing algorithm
to pro...