Browse Prior Art Database

Improved Generic Location of Elements in a Hash Table

IP.com Disclosure Number: IPCOM000117302D
Original Publication Date: 1996-Jan-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 2 page(s) / 100K

Publishing Venue

IBM

Related People

Richards, AJ: AUTHOR

Abstract

The use of a hash table enables rapid full-key location in storage systems. However, a set of keyed elements may need to be both addressed by use of the full keys and also by generic keys and browsed in alphabetical order.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Improved Generic Location of Elements in a Hash Table

      The use of a hash table enables rapid full-key location in
storage systems.  However, a set of keyed elements may need to be
both addressed by use of the full keys and also by generic keys and
browsed in alphabetical order.

      These operations and the addition of new elements needs to be
done with efficiency in both use of CPU instructions and the minimal
set of storage areas touched (to reduce storage subsystem demands).
Existing systems already use a hash table to optimize full key
locates, but uses a chain of elements when adding and deleting
elements to allow for generic location.  This is inefficient as the
number of elements increases, because large portions of the set of
elements need to be addressed (either to add an element to the chain,
or to find the element with a higher key in generic locate).  It is
desirable to increase the performance of these parts without
disturbing the hash table.  In addition, the mechanism must remain
efficient even though large parts of the set of keyed elements may
change, maintaining the mechanism must not be too onerous even with
these dynamic changes, and substantial extra storage should not be
required (such as adding pointers to each of the set of keyed
elements).

      These requirements can be met as described below by adding an
array of 'ranges' to the implementation.  Each array element consists
of a pointer and a counter, and a record is kept of the number of
keyed elements in the set, the number of array elements, the number
of array elements in use, and the optimal range size.

Initialization

The array is initialized as follows:
  1.  The total number of elements is divided by the total number of
       array ranges and the result is halved.  This number is
increased
       to be larger than a minimum number.  This is the optimal range
       size.
  2.  The set of keyed elements are split up into alphabetical ranges
       of the optimal range size.  Each array pointer points to the
       element that starts the next range, and the counters are set
to
       the number of elements in each range.
  3.  The total number of ranges made by this process is determined.

Searching for position

      When a requestor needs to know where a particular key (or
partial key) would fit in the set of keyed elements, a binary search
of the used part of the array is performed, using the keys pointed to
by the array elements.  Then the exact point is determined by moving
forward from the element pointed to by the located range element
until the elements is found (which must be before the next array
element).  This reduces the search length from...