Browse Prior Art Database

Size Independent Hashing Scheme for Rectangular Arrays

IP.com Disclosure Number: IPCOM000087228D
Original Publication Date: 1976-Dec-01
Included in the Prior Art Database: 2005-Mar-03
Document File: 2 page(s) / 15K

Publishing Venue

IBM

Related People

Crowder, HP: AUTHOR [+3]

Abstract

Described herein is a technique for allocating storage for two-dimensional rectangular arrays which enjoys the following properties: (1) The technique is insensitive to the dimensions of the array being stored; that is, it can be used to store/retrieve any two-dimensional array. (2) As a corollary to point (1), the technique is well-suited for storing arrays that can expand dynamically. (3) The technique generalizes easily to a technique for storing rectangular arrays of any desired dimensionality, while retaining size independence.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 53% of the total text.

Page 1 of 2

Size Independent Hashing Scheme for Rectangular Arrays

Described herein is a technique for allocating storage for two-dimensional rectangular arrays which enjoys the following properties: (1) The technique is insensitive to the dimensions of the array being stored; that is, it can be used to store/retrieve any two-dimensional array. (2) As a corollary to point (1), the technique is well-suited for storing arrays that can expand dynamically. (3) The technique generalizes easily to a technique for storing rectangular arrays of any desired dimensionality, while retaining size independence.

When arrays are to be accessed by a sequence of independent probes to array positions, one strategy for storing/retrieving arrays is by using a hashing scheme based on position coordinates as keys. Thus one would retrieve the contents of array position a(ij) via the coordinate pair <i,j>.

Array-hashing schemes can be made independent of the size of the arrays to be stored if they are designed to use external collision resolution. In such a scheme, one typically has a hashing function HASH that associates with each (potential) array position <i,j> a "bucket" HASH(i,j). The bucket is some chained data structure such as a linear list or a balanced search tree. Such external hashing schemes are well known; the reader is referred to [*] for details concerning several alternative constructions for external hashing schemes. The efficiency of an external hashing scheme depends on a judicious choice of both the bucket organization and the hashing function.

The choice of a bucket organization depends quite heavily on one's specific computing environment. For example, balanced search trees have accessing characteristics that are dramatically superior to those of linear lists, but linear lists require somewhat less storage and much less bookkeeping than do balanced search trees. These issues are discussed with appropriate literature references in the cited reference.

Presented here is a size-independent hashing function for rectangular arrays that is demonstrably efficient over a broad range of bucket organizations, including both linear lists and balanced search trees.

The function is defined as follows. For any positive integers i,j,

H(i,j) = j + (i-1)2 /[log/2/j]/.

An alternative, but probably equivalent definition of H, is given by: The binary representation of H(i,j) is obtained by concatenating, in this order, the binary representation of i followed by the shortest binary representation of j with the initial 1 removed. More explicitly, if 1Alpha(1)...Alpha(m) and 1Beta(1)...Beta(n)(Alpha(k),Beta(l)Epsilon{0,1}) are binary representations of i and j, respe...