Effective Approach to Query I/O Parallelism Using Sequential Prefetch and Horizontal Data Partitions

IP.com Disclosure Number: IPCOM000105864D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 4 page(s) / 101K

Publishing Venue

IBM

Related People

Liu, TS: AUTHOR [+4]

Abstract

When the CPU cost and I/O cost for a query can be estimated by a relational database system, this approach can be used to improve the elapsed time of I/O-bound queries by enabling multiple parallel I/O streams on disjoint horizontal data partitions of a table, with each I/O stream being handled asynchronously by a Sequential Prefetch engine. Using this technique, the database management system can maximize its I/O bandwidth to the data in executing a query, while minimizing the required system resources.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      This disclosure proposes an approach to providing I/O parallelism
support in an RDBMS to reduce the elapsed time of I/O-bound queries.  It
fully utilizes the I/O bandwidth provided by a partitioned table and
exploits the Sequential Prefetch capability.

      Instead of using one database I/O stream to access a table,
this disclosure proposes splitting that I/O stream into multiple
parallel I/O streams, with each I/O stream accessing a disjoint
horizontal data partition of the table.  The I/O streams are
triggered in a round-robin fashion to fetch data from all partitions
concurrently.  Data is fetched asynchronously via the sequential
prefetch mechanism, thus allowing I/O to different data partitions to
overlap.
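
      The round-robin triggering described above can be illustrated
with a minimal Python sketch.  It is not the disclosed implementation:
the thread pool stands in for the Sequential Prefetch engine, and the
names read_pages, parallel_scan, and PREFETCH_PAGES are hypothetical,
as is the modeling of each partition as an in-memory list of pages.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Sequence

PREFETCH_PAGES = 4  # hypothetical prefetch quantity (pages per request)


def read_pages(partition: Sequence[str], start: int, count: int) -> List[str]:
    """Stand-in for one sequential prefetch request against one partition."""
    return list(partition[start:start + count])


def parallel_scan(partitions: Sequence[Sequence[str]]) -> Iterator[str]:
    """Yield all pages, keeping one prefetch in flight per partition."""
    with ThreadPoolExecutor(max_workers=max(1, len(partitions))) as pool:
        offsets = [0] * len(partitions)
        pending = {}
        # Trigger the first prefetch on every non-empty partition so that
        # all I/O streams start together.
        for i, part in enumerate(partitions):
            if part:
                pending[i] = pool.submit(read_pages, part, 0, PREFETCH_PAGES)
                offsets[i] = PREFETCH_PAGES
        # Visit the streams round-robin until every partition is exhausted.
        while pending:
            for i in sorted(pending):
                pages = pending.pop(i).result()  # wait for this stream's buffer
                if offsets[i] < len(partitions[i]):
                    # Re-trigger the stream before handing its pages to the
                    # caller, so its next I/O overlaps with processing.
                    pending[i] = pool.submit(
                        read_pages, partitions[i], offsets[i], PREFETCH_PAGES
                    )
                    offsets[i] += PREFETCH_PAGES
                yield from pages


if __name__ == "__main__":
    table = [
        [f"a{n}" for n in range(5)],
        [f"b{n}" for n in range(2)],
        [f"c{n}" for n in range(9)],
    ]
    print(list(parallel_scan(table)))
```

      In this sketch a stream is re-triggered as soon as its buffer is
collected, so a prefetch on every unfinished partition is outstanding
while the caller processes the pages just returned.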

      The following steps can be used to apply I/O parallelism to an
I/O-bound query:

o   Determine the parallel groups within the query.  A parallel group
    is a series of relational operations that ends with data
    materialization.

o   For each parallel group:

    1.  Estimate the qualified table partitions for a parallel group
        based on predicate selectivity.

    2.  Determine the degree of parallelism by

        a.  Estimating the best possible elapsed time for a parallel
            group using the formula:

            Best Possible Elapsed Time =
              Maximum (CPU Elapsed Time,
                       Maximum Partition I/O Elapsed Time)

        b.  Deriving the logical partitions for a parallel group
            using Key Partitioning or Page Partitioning.  Each
            logical partition will c...
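
      The elapsed-time formula in step 2a can be written as a small
Python function.  This is a sketch under stated assumptions: the
function name, the parameter names, and the use of seconds as the unit
are hypothetical, and the way the optimizer obtains the CPU and
per-partition I/O estimates is not shown.

```python
from typing import Sequence


def best_possible_elapsed_time(
    cpu_elapsed: float, partition_io_elapsed: Sequence[float]
) -> float:
    """Return max(CPU elapsed time, largest per-partition I/O elapsed time).

    Both inputs are assumed to be estimates in the same unit (for
    example, seconds) for one parallel group.
    """
    # With fully overlapped I/O, the slowest partition bounds the I/O time.
    max_partition_io = max(partition_io_elapsed, default=0.0)
    return max(cpu_elapsed, max_partition_io)


if __name__ == "__main__":
    # I/O-bound group: the slowest partition (35.0) dominates.
    print(best_possible_elapsed_time(12.0, [20.0, 35.0, 28.0]))
    # CPU-bound group: adding I/O streams cannot go below the CPU time.
    print(best_possible_elapsed_time(40.0, [20.0, 35.0, 28.0]))
```

      When the CPU term dominates, the group is CPU-bound and
additional parallel I/O streams cannot lower this bound.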