Browse Prior Art Database

Cross Partition Subquery Access In A Sequential Database System

IP.com Disclosure Number: IPCOM000239374D
Publication Date: 2014-Nov-03
Document File: 5 page(s) / 69K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for cross partition subquery access in a sequential database system. The disclosed method uses an early-out process to discard a row as soon as it is found to be FALSE.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 25% of the total text.

Page 01 of 5

Cross Partition Subquery Access In A Sequential Database System

Data in large tables is generally partitioned for improved data management. In addition, it is beneficial to partition indexes to align with data partitions for ease of index object maintenance. This construct can present significant performance challenges in a transactional database implementation, where parallel processing is generally not employed due to its high overhead.

Figure 1: Example query with subquery, and table partitioned by column YEAR, and a partitioned secondary index on columns ACCTNO and VERSION
SELECT *
FROM TABLE T1
WHERE VERSION =
(SELECT MAX(T2.VERSION)

FROM TABLE T2

WHERE T2.ACCTNO = T1.ACCTNO);

In a data table (Figure 1) with primary partitions and secondary partition indices (e.g., Data Partitioned Secondary Index (DPSI), local index), as the data is processed for a partition, each qualifying row (i.e. in a table) results in the execution of a subquery. To evaluate the subquery, each index partition must be sequentially probed to determine if the current row is the maximum version for the related column.

For query performance reasons, particularly where a table has partitions, a subquery may be de-correlated, wherein the subquery result is built once, cached, and then reused by each outer table row or used to drive a join to the outer table. Also, caching may be used for the result of the first subquery execution to be reused by each other T1 row with the same ACCTNO (Figure 1), wherein a lookup to the cache is performed for each T1 row before attempting to execute the subquery.

The method disclosed herein uses an "early-out" process to discard a row as soon as it is found to be FALSE, which tends to reduce cache size. The method immediately disqualifies the row as soon as another row is found that has a VERSION greater than the current row. This also potentially avoids the need to search all partitions to disqualify a row.

The disclosed method caches the MAX(VERSION) (e.g., for a data table containing eight (8) ACCTNO rows spread over five (5) partitions, as in Figure 2) currently found

1


Page 02 of 5

by the early-out process, as well as the partitions that have been searched to find that MAX(VERSION). While this might not be the absolute MAX(VERSION) for the key being searched, it provides enough information to determine whether the current row being compared can be discarded, or if additional partitions must be probed.

The disclosed method minimizes both the amount of duplication of partition probes and the risk of having to probe all partitions to find and cache the MAX(VERSION) for a given key multiple times.

Early-out for subquery execution

The disclosed method exploits early-out processing to minimize the number of partitions that need to be probed to validate the subquery result.

Figure 2: Method using 8 ACCTNO rows spread over 5 partitions, and the original correlated subquery

SELECT *


FROM TABLE T1
WHERE VERSION = (SELECT MAX(T2.VERSION...