Browse Prior Art Database

Huffman Code Heuristic to Determine General Join Sequences For Multi-join Queries

IP.com Disclosure Number: IPCOM000121505D
Original Publication Date: 1991-Sep-01
Included in the Prior Art Database: 2005-Apr-03
Document File: 2 page(s) / 80K

Publishing Venue

IBM

Related People

Chen, MS: AUTHOR [+3]

Abstract

In a relational database system, joins are the most expensive operations to execute especially as the database size is increasing nowadays. Some complex queries may take hours or even days to complete, and the system performance is thus degraded. Note that different execution sequences of joins in a query will require different execution times. It is therefore very important to develop an efficient scheme which determines effective execution sequences of joins to minimize the query execution time. The conventional methods to determine the execution sequence of joins to perform a multi-join query consider only the sequential join sequences, such as the one ((((R1, R2),R3),R4),R5) where (Ri,Rj) denotes the resulting relation of joining Ri and Rj .

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Huffman Code Heuristic to Determine General Join Sequences For Multi-join
Queries

      In a relational database system, joins are the most
expensive operations to execute especially as the database size is
increasing nowadays.  Some complex queries may take hours or even
days to complete, and the system performance is thus degraded.  Note
that different execution sequences of joins in a query will require
different execution times. It is therefore very important to develop
an efficient scheme which determines effective execution sequences of
joins to minimize the query execution time.  The conventional methods
to determine the execution sequence of joins to perform a multi-join
query consider only the sequential join sequences, such as the one
((((R1, R2),R3),R4),R5) where (Ri,Rj) denotes the resulting relation
of joining Ri and Rj .  The sequential join sequences, though easy to
determine, usually result in long query execution time.  To remedy
this, the join sequences of the general form, such as the one
((R1,R2),((R3, R4),R5)), are exploited in this disclosure.  In view
of the fact that the cost of a join operation is in proportion to the
sizes of its operands and the resulting relation, several heuristic
algorithms, based on the Huffman code [1], are developed to determine
the execution sequence of joins.  The heuristics are so designed that
they will attempt to execute joins of smaller resulting relations
first so as to minimize the sizes of intermediate relations derived
during the execution of a multi-join query.  The total execution time
of the query will then be reduced.  First, a heuristic scheme MC
based on the Huffman code to determine the join sequence...