Browse Prior Art Database

# Fast Table Joining in Relational Data Bases and Fast Row Retrieval

IP.com Disclosure Number: IPCOM000039669D
Original Publication Date: 1987-Jul-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 5 page(s) / 50K

IBM

## Related People

Krieg, GJ: AUTHOR

## Abstract

Fast row retrieval in relational data bases is possible by transferring the technique of known full text information retrieval systems with the concept of occurrences to relational data bases. Domain support is the key to efficiently processing equi-join, one of the most important operators of relational algebra. From a purely formalistic point of view, two tables (relations) are joined by considering a set of row pairs from both tables and by combining each row of the first table with each row of the second table, forming what mathematicians call a Cartesian product of two sets of rows. For equi-joins, selection from this product is made by an equal comparison on equal.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 28% of the total text.

Page 1 of 5

Fast Table Joining in Relational Data Bases and Fast Row Retrieval

Fast row retrieval in relational data bases is possible by transferring the technique of known full text information retrieval systems with the concept of occurrences to relational data bases. Domain support is the key to efficiently processing equi-join, one of the most important operators of relational algebra. From a purely formalistic point of view, two tables (relations) are joined by considering a set of row pairs from both tables and by combining each row of the first table with each row of the second table, forming what mathematicians call a Cartesian product of two sets of rows. For equi-joins, selection from this product is made by an equal comparison on equal. Thus, the processing time of such comparisons increases with the "area" of the Cartesian product (the number of rows in the first table multiplied by the number of rows in the second table). Such joins are essential in relational algebra: no non-trivial data base design can do without them. The joins are effected much faster by using domains which are equivalent to the dictionary in information retrieval systems and inverted files. There is a principle in the relational world, according to which tables contain only data and no hidden pointers. A prejudice existing with respect to inverted files is that they are extremely difficult to update. The solution to online updating of inverted files is described in IBM Technical Disclosure Bulletin 27, 3216-3220 (November 1984). It is not so trivial to recognize that the performance problem of the relational join operator, the online update problem of inverted files and the software implementation problem of an associative memory are basically the same. For resolving those problems, it is necessary to combine elements of information retrieval of textual information, relational data bases and inverted file updates. First, the relational data base R and the information retrieval IR are compared. For mapping terms in IR and R, please refer to Fig. 1. According to Fig. 1, the following equivalents are to be determined: the equivalents for dictionary, occurrence and occurrence string in R. The IR technique can also be used in the relational world. What is the equivalent of a textual word in the R world? The table and field designations, concatenated with the field contents of every field in a row of a relational data base, form as many artificial words as there are fields in a row (counting duplicates only once) (see Fig. 5 - 'T,acc,007' or 'U,book, 17.00'). What is the equivalent of the IR dictionary in R terms? A dictionary in relational terms is a sorted sequence of all values in all fields of all rows of all tables in the data base collection. In other words, a dictionary exactly defines the domains of all fields of all rows in all tables (see Fig. 5 and compare with Fig. 4). What is the equivalent of a document in the IR world? A document in the IR world co...