Original Publication Date: 1976-Jun-21
Included in the Prior Art Database: 2007-Mar-30
Software Patent Institute
Bayer, Rudolf: AUTHOR [+3]
In this section we describe a modification of Simple Prefix B-trees with the goal to further reduce the size, mainly the height of the index-part of such trees. Assume that the ordering of the keys is the lexicographical order according to the collating sequence of the alphabet, over which the keys are defined. a Let T(P) be the subtree of index and leaf-pages with root P. Reconsider the index-part of a Simple Prefix B-tree. The tree structure determines for each page P a largest lower bound h(P) and a smallest upper bound p(P), such that for all keys x or separators s which areor might be stored in T(P), the following holds: Let be the smallest letter in the alphabet and let .n be larger than a l l keys being considered. Then X and y are defined as follows: For the root R of the tree: To define h and P for other nodes, let P be a branch-node with .' lower and upper bounds X(P) and u(P), respectively, and the following structure: P: Po *PR are pointers to the sons of P which are branch- or leaf-nodes.
June 21, 1976
Computer Science ~udolf Bayer*
. . Karl Unterauer**
- IBM Research Laboratory
San Jose, California 95193
ABSTRACT: Two modifications of B-trees are described, Simple Prefix B-trees and Prefix B-trees. Both store only parts of keys, namely prefixes, in the index part of a B*-tree. In
Simple Prefix 8-trees those prefixes are selected carefully to minimize their length. In Prefix B-trees the prefixes need not be fully stored, but are reconstructed as the tree is searched. Prefix B-trees are designed to combine some of the advantages of B-trees, digital search trees, and key
compression techniques, but to reduce the processing overhead of compression techniques.
* Permanent address: Technical University Elmich, Institute for Informatics, 8 Munich 2, Arcisstr 21, West Germany.
** Technical University Munich, 8 Munich, W. Germany.
1. Simple Prefix B-Trees
We asslrme that the reader is familiar with B-treesly6 and with a
variation of B-trees, called B*-trees. 697 In B*-trees the records of a file together with the keys identifying them are only stored in leaf-nodes of the tree structure. We call the non-leaves branch-nodes or branch-pages. Leaves can be linked to their neighbors to allow sequential processing of the leaves without using the branch-nodes of the B*-tree.
We call the part of a B*-tree consisting only of the branch nodes the B*-index, the ordered set of leaves the B*-file. In the B*-index some keys, which appear again in the B*-file with their associated records, are repeated without their records.
The following observation is important for the rest of this paper: The keys stored in the B*-index are only used to direct the search algorithm and to determine in which subtree of a given branch-node a key and its associated record w i l l be found, if it is in the tree at all.
It is now a fairly obvious observation that we need not necessarily use the keys in the B*-file to construct the B*-index. Instead, we
can use other strings, constructed to have desired properties, for building up the equivalent of the B*-index of a B*-tree.
To give a simple example, assume that a leaf contains keys Bigbird, Burt, Cookiemonster, Ernie, Snuffleopogus Y
and that inserting the key "Grouchf' with its record forces us to split this leaf into two as follows:
Bigbird, Burt, Cookiemonster Ernie, Grouch, Snuffleopogus
Instead of storing the key "Ernie" in the index, it obviously suffices to use one of the one-letter strings "D", "E" for the same purpose.
In general, we can select any string s with the property
Cookiemonster < s I
Ernie (1) and store it in the index part to separate the...