Browse Prior Art Database

Optimized Key Compression

IP.com Disclosure Number: IPCOM000081073D
Original Publication Date: 1974-Mar-01
Included in the Prior Art Database: 2005-Feb-27
Document File: 5 page(s) / 27K

Publishing Venue

IBM

Related People

Franaszek, P: AUTHOR

Abstract

In access methods in operating systems, particularly, in virtual machines, a key-compression algorithm is advantageously utilized to obtain a compact representation of the index. The use of such an algorithm results in a significant reduction in storage requirements and, consequently, a smaller number of page faults as well as a lower tree height for a given page and index.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 36% of the total text.

Page 1 of 5

Optimized Key Compression

In access methods in operating systems, particularly, in virtual machines, a key-compression algorithm is advantageously utilized to obtain a compact representation of the index. The use of such an algorithm results in a significant reduction in storage requirements and, consequently, a smaller number of page faults as well as a lower tree height for a given page and index.

In a known key-compression approach, the index is organized as a tree and, at each internal node, a mechanism is provided for determining which sub-tree contains the key for which a search is being made. The latter consists of a key K which was the largest in the sub-tree at the time that it was formed. The key K is compressed according to the algorithm described below considered in conjunction with the figure.

In considering the figure which conceptually depicts a level in the storage-access method index, the intervals A, B and C correspond to the j/th/ level in the index tree. K(1) is a key which is to be front and rear compressed, and K(0), K(2) and K(3) are, respectively, the largest keys in interval A and the smallest and largest keys in interval C. A key K(i) consists of: N digits:

K(i) = <K(i1),K(i2),....,K(iN)> [1]

Compression is performed in two steps:

(a) Rear compression

Let r be the most significant digit for which K(1r) not=

K(2r). Then, K(1(r+1)),K(1(r+2))... are eliminated. (b) Front compression Let s be the most significant digit for which K(0s) not=

K(1s). Then, K(11),K(12),...,K(1(s-1)) are eliminated.

Thus, for example, if K(0) = 3458621

K(1) = 3467832 [2]

K(2) = 3467943 then the compressed key K(1) = 678. To the compressed key there is appended information

which gives (i) the number of digits eliminated by front compression, and (ii) the length of the compressed key.

As new keys are added, they are inserted in locations which permit them to be found by the index. For example, if the key 3467833 were added above, it would be inserted in interval B.

The method described below is based upon the choosing of the intervals at each level of the index tree so as to increase the amount of compression that is obtained. The compression method described above is retained. The fact that the keys are sorted implies that in many cases a small perturbation in the point of partition between intervals will result in a very significant increase in the amount of compression.

1

Page 2 of 5

In considering optimized compression, reference is again made to the figure. In this connection, let it be assumed that each interval contains M items. In accordance with the described technique, rather than choosing the partition points between intervals in a manner dictated the free-space allocation requirements, the partition points are permitted to move by m items, where typically m << M. For example, if M = 200, the absolute value of m < 10 might be used with little effect on the space utilization, but with a significant impact on the amount of a...