Spiral Storage: Incrementally Augmentable Hash Addressed Storage

IP.com Disclosure Number: IPCOM000048692D
Original Publication Date: 1982-Mar-01
Included in the Prior Art Database: 2005-Feb-09
Document File: 4 page(s) / 57K

Publishing Venue

IBM

Related People

Martin, GNN: AUTHOR

Spiral Storage: Incrementally Augmentable Hash Addressed Storage

Several well-known techniques for organizing data so that they may be retrieved on some key attribute divide the data space into k equal parts and use a function that maps all possible values of the key attribute onto the integers 0 to k-1. When storing a data item whose key maps to n, we attempt to place the item in or near the (n+1)th part of the data space.
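
For concreteness, a fixed-size scheme of this kind might look like the following minimal sketch in Python; the particular hash (CRC-32) is an assumption of the sketch rather than anything the article prescribes:

    import zlib

    def conventional_location(key, k):
        # Map a key onto the integers 0 .. k-1 for a data space of k equal parts.
        # An item whose key maps to n is stored in or near the (n+1)th part.
        # CRC-32 over the key's text is used purely as an illustration of a
        # mapping function; any deterministic hash of the key would serve.
        return zlib.crc32(str(key).encode("utf-8")) % k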

However, as the number of data items that we wish to store increases, we must increase the size of our data space, modify the mapping function to suit the new value of k, and reorganize the items already stored in the data space. Traditionally this has been an expensive operation, which has limited the use of hashing to special situations. This article describes a simple mapping function that minimizes the impact of reorganization.

The basic technique is to map the data items into the data space so that they tend to be more dense at one end than at the other. To expand the space we extend it at the less dense end, freeing a smaller amount of space at the more dense end. Items that used to occupy the freed space are spread more sparsely over the new space instead. This process is illustrated in Fig. 1.

The top line of the figure illustrates 24 data items mapped into a data space of 11 locations using an algorithm that tends to place twice as many items in the leading locations as in the trailing locations. The items are remapped into 12 locations by removing the leading location and splitting the contents between two new trailing locations; this process is repeated until the data space consists of 19 locations.
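
The behaviour in the figure can be reproduced with the exponential-spiral mapping developed below. The following is a minimal sketch in Python; it assumes a growth factor of 2 (suggested by the figure's two-to-one bias, though this excerpt does not fix the base), assumes an item can always be stored exactly where the mapping suggests, and glosses over the care that floating-point arithmetic needs in practice. It checks that growing the data space from 11 to 12 locations disturbs only the items in the retired leading location, splitting them between the two new trailing locations:

    import math
    import random

    GROWTH = 2  # assumed growth factor: leading locations about twice as dense as trailing ones

    def location(h, start):
        # x is the smallest value >= start whose fractional part is h;
        # the item's location is then floor(GROWTH ** x).  One revolution,
        # start <= x < start + 1, covers the whole data space.
        x = math.ceil(start - h) + h
        return math.floor(GROWTH ** x)

    random.seed(1)
    items = [random.random() for _ in range(24)]   # an item is represented by its hash h in [0, 1)

    start = math.log(11, GROWTH)                   # data space is locations 11 .. 21 (11 locations)
    before = {h: location(h, start) for h in items}

    start = math.log(12, GROWTH)                   # data space is locations 12 .. 23 (12 locations)
    after = {h: location(h, start) for h in items}

    moved = [h for h in items if before[h] != after[h]]
    assert all(before[h] == 11 for h in moved)         # only the retired leading location is disturbed
    assert set(after[h] for h in moved) <= {22, 23}    # its items split between the two new trailing locations
    print(len(moved), "items moved, to locations", sorted(set(after[h] for h in moved)))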

We shall now describe an algorithm designed to achieve this biased mapping, and draw a geometrical analogy. To start with, we assume that it is always possible to store a data item in the location suggested by the mapping algorithm.

Let us define some symbols.

K is all possible key values, k ∈ K.

R is the real numbers, r ∈ R.

J is the interval [0, 1), j ∈ J. In other words, j greater than or equal to 0 and less than 1. For r ∈ R we write ⌊r⌋ for the largest integer not exceeding r, so that r − ⌊r⌋, the fractional part of r, lies in J.

H is a hash function, mapping K onto J 'at random'. As usual, if we can choose an H that maps K onto J more evenly than 'random', we may profit by it.

Juxtaposition binds functions to their arguments, and we use the lambda notation; thus H = λk.Hk.
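
As a concrete illustration of this notation, one possible H and the fractional-part operation could be sketched in Python as follows; the particular hash (BLAKE2b scaled into [0, 1)) is an assumption of the sketch, not the article's choice, and any H that spreads K evenly over J would serve:

    import hashlib
    import math

    def H(k):
        # One possible H: map an arbitrary key k onto J = [0, 1) 'at random'.
        digest = hashlib.blake2b(str(k).encode("utf-8"), digest_size=8).digest()
        return int.from_bytes(digest, "big") / 2 ** 64

    def frac(r):
        # The fractional part r - floor(r), which always lies in J.
        return r - math.floor(r)

    assert 0.0 <= H("some key") < 1.0
    assert frac(3.75) == 0.75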

In our analogy we interpret Hk as a fraction of one revolution, and call Hk the direction of the data item with key k. To map directions to locations, we imagine the storage medium laid out along an exponential spiral (Fig. 2), each unit length of the spiral representing one storage location. The data space occupies one complete revolution of the spiral, from point A to point B; thus every direction maps to exactly one location in the data space. Three data items have been shown, with keys k1, k2, and k3; their directions are Hk1, Hk2, and Hk3, and the locations at which they ar...