Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

B-Tree Index Methods in a Binary Relational Database

IP.com Disclosure Number: IPCOM000047756D
Original Publication Date: 1983-Dec-01
Included in the Prior Art Database: 2005-Feb-08
Document File: 4 page(s) / 27K

Publishing Venue

IBM

Related People

Pullin, DJ: AUTHOR

Abstract

This article describes the management of multiple indexes within a single fixed-block-size file, such as may be used in a Binary Relational Database, implemented on a paging system. A binary relational database is characterized by storing large numbers of "triples" of data such as (S,R,T), where S is an entity that is related, by a named relation, R, to the second entity, T. The triples related to a specific entity, say, S, form a tree structure. The branches from each node of the tree are ordered, and are implemented in the form of a keyed index. For example: Each triple, such as (S,R2,T22), has a corresponding triple (T22,R2',S) representing the inverse relation. (Depending the implementation of the Binary Relational Database there may be up to 5 "inverses" representing the 6 possible access paths to a triple.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 38% of the total text.

Page 1 of 4

B-Tree Index Methods in a Binary Relational Database

This article describes the management of multiple indexes within a single fixed-block-size file, such as may be used in a Binary Relational Database, implemented on a paging system. A binary relational database is characterized by storing large numbers of "triples" of data such as (S,R,T), where S is an entity that is related, by a named relation, R, to the second entity, T. The triples related to a specific entity, say, S, form a tree structure. The branches from each node of the tree are ordered, and are implemented in the form of a keyed index. For example: Each triple, such as (S,R2,T22), has a corresponding triple (T22,R2',S) representing the inverse relation. (Depending the implementation of the Binary Relational Database there may be up to 5 "inverses" representing the 6 possible access paths to a triple.) Further, each of the entities T11, T12, etc., can be connected to other entities, or to each other, or even to S, by other relations. The effect of these relations is that the structure is a network with localized tree structuring. This structure is to be represented subject to the following objectives:
1. The physical dataset on which the structure is to

reside must use a fixed block size, to be amenable

to hardware and software paging. 2. No advanced knowledge is available about the data be zero, a few, or a few million. The primitive

update operations are only: add a triple and

delete a triple. 3. The performance of a typical traverse of the network

should be optimal. The key performance determinant

is the number of physical pages referenced (the reference

set) and hence the expected number of pages to be fetched

from DASD for a given Real Storage allocation. 4. Because of the possibility of an enormous number of pointers pointing to a given entity node, entity nodes

cannot be moved between pages once they have been

positioned. An intra-page indirection mechanism is assumed

so that nodes can be moved within a page (to avoid storage

fragmentation (entity pointer/page name indirection, and

modification of all pointers) are excluded on

performance grounds. To handle the arbitrary, and possibly large, branching factor at any node, the branches are implemented as a B-tree. (See D. Comer, "The Ubiquitous B-Tree," ACM Computing Surveys 11, 2, 121-137 (1979).) In effect, this introduces extra nodes into the tree structure - the nodes of the B-tree. Unlike the nodes representing entities (S, T11, T12, etc., above) these nodes have no database significance and cannot be part of a general network. Specifically, nothing points to a B-tree node other than the parent block in the B-tree, and a non-leaf B-tree node points only to daughter blocks. The result of implementing a B-tree is nodes of varying and unpredictable sizes, and it is the management of these modes - in particular, when should a B-tree node be split, and how should nodes be placed on physical pages so as to satisfy...