Browse Prior Art Database

Two Step Approach to Optimize Parallel Execution of Multi-join Queries

IP.com Disclosure Number: IPCOM000107715D
Original Publication Date: 1992-Mar-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 3 page(s) / 151K

Publishing Venue

IBM

Related People

Chen, MS: AUTHOR [+3]

Abstract

In this disclosure, we explore inter-operator parallelism for the execution of a multi-join query in a multiprocessor system to minimize the query execution time. Inter-operator parallelism means that different join operators within a query can be executed in parallel by different clusters of processors. The subject of exploiting inter- operator parallelism for the execution of a multi-join query mainly consists of the following two major issues: (i) join sequence scheduling, i.e., scheduling the execution sequence of joins in the query, and (ii) processor allocation, i.e., determining the number of processors for each join obtained in (i) so that the execution time required for the query can be minimized.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 42% of the total text.

Two Step Approach to Optimize Parallel Execution of Multi-join Queries

       In this disclosure, we explore inter-operator parallelism
for the execution of a multi-join query in a multiprocessor system to
minimize the query execution time.  Inter-operator parallelism means
that different join operators within a query can be executed in
parallel by different clusters of processors.  The subject of
exploiting inter- operator parallelism for the execution of a
multi-join query mainly consists of the following two major issues:
(i) join sequence scheduling, i.e., scheduling the execution sequence
of joins in the query, and (ii) processor allocation, i.e.,
determining the number of processors for each join  obtained in (i)
so that the execution time required for the query can be minimized.
For ease of exposition, the efficiency of the join sequence, measured
by its execution on a single processor system, is termed join
sequence efficiency, and the effectiveness of processor allocation,
determined by the speedup achieved over the single processor case, is
termed processor allocation efficiency.  The overall efficiency for
dealing with the above two issues then depends closely on the two
factors.  The join sequences in which the resulting relation of an
intermediate join is only used in the next join are termed sequential
join sequences, and those without such a constraint are termed
general join sequences.  Prior approaches on executing multi-join
queries mostly focused on the use of sequential join sequences.
However, to fully exploit inter-operator  parallelism, we have to
utilize general join sequences.  For the issue of processor
allocation to join operations, the objective in the prior study of
intra-operator parallelism is to determine the processor allocation
which achieves the minimum execution time of a single two- way join.
Such a selection is referred to as operational point selection.  In
exploiting inter-operator parallelism, we are dealing with the
execution of a complex query with multiple joins where different
joins are allowed to be executed in parallel in different clusters of
processors. Thus, to minimize the execution time of a multi-join
query, in addition to operational point selection, requires more
factors, such as execution dependency and system fragmentation, to be
considered for a better processor allocation efficiency.

      We propose a two-step approach to exploit inter-operator
parallelism within a multi-join query to minimize the execution time
of the query in a multiprocessor system.  The two-step approach is
composed of:  (i) scheduling the execution sequence of multiple joins
within a query as if under a single processor environment to optimize
the join sequence efficiency, and (ii) determining the number of
processors to be allocated for the execution of each joint operation
to optimize the processor allocation efficiency.  In the first step,
we apply join sequence heuristics to determine...