
Light Access Plan Validation

IP.com Disclosure Number: IPCOM000031126D
Original Publication Date: 2004-Sep-13
Included in the Prior Art Database: 2004-Sep-13
Document File: 3 page(s) / 71K

Publishing Venue

IBM

Abstract

When a database query optimizer or database statistics engine attempts to determine whether a saved query access plan can safely be reused, it may probe a variety of statistical sources, including indexes, frequent value lists, histograms, and others. This probing can be resource intensive. Described are several methods for determining that this probing is unnecessary in certain environments, so that system resources are not consumed performing otherwise unnecessary probes.



Light Access Plan Validation

Disclosed is a program for a database statistics engine and/or database query optimizer (the terms statistics engine and query optimizer are used interchangeably throughout this disclosure) to detect a stable database environment and, therefore, skip the probing of various statistics sources on query Access Plan (AP) revalidation or reuse. Thus, much of the full open overhead of revalidating the access plan is avoided. Such an invention is important because query performance in general, and full open time in particular, is a constant concern for customers.

     When a query is optimized, the query Access Plan (AP) is saved in an AP cache so that it can be reused later. However, an AP cannot be indiscriminately reused, because of potential changes in the underlying table's data, in the selectivity of the host variables plugged into the query, or in the database configuration. If it were, performance problems would inevitably follow. Thus, prior art has established a means for quickly verifying whether the previously used AP can be safely reused given the current host variable(s) and the state of the underlying data.
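
     As a rough illustration of this reuse flow, the Python sketch below consults an AP cache before paying the full optimization cost; the names (AccessPlan, ApCache, get_plan, is_safe_to_reuse) are hypothetical and are not part of the disclosure.

class AccessPlan:
    """Saved access plan plus the estimates captured when it was optimized."""
    def __init__(self, plan_id, saved_estimates):
        self.plan_id = plan_id
        self.saved_estimates = saved_estimates

class ApCache:
    """Keyed by query text for simplicity; a real cache would be richer."""
    def __init__(self):
        self._plans = {}

    def lookup(self, query_text):
        return self._plans.get(query_text)

    def store(self, query_text, plan):
        self._plans[query_text] = plan

def get_plan(query_text, host_vars, cache, optimize, is_safe_to_reuse):
    """Reuse a cached AP only after verifying it is still safe for the current host variables."""
    plan = cache.lookup(query_text)
    if plan is not None and is_safe_to_reuse(plan, host_vars):
        return plan                              # fast path: reuse the saved AP
    plan = optimize(query_text, host_vars)       # otherwise pay the full optimization cost
    cache.store(query_text, plan)
    return plan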

     For example, given a query such as the one in Example 1 below, the optimizer may take the current value of :HV and, if there is an index over Column1, ask the statistics engine to probe the index to estimate how many rows will be returned. If the estimate is vastly different from the estimate made when the query was first optimized, the AP may be reoptimized on the chance that a better-performing access plan is available. As another example, it may look at the cardinality of Column2 from a column statistic to see if it has changed drastically compared to when the query was first optimized. If none of the estimates are drastically different, the AP can be safely reused.

Example 1:

Select Column2, count(*) from MyTable where Column1 = :HV group by Column2 order by count(*)
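
     The kind of comparison described above might look like the following Python sketch; the probe interfaces (probe_index_estimate, column_cardinality) and the 10x "drastically different" threshold are assumptions for illustration, not values given in the disclosure.

REOPT_RATIO = 10.0   # assumed threshold: a 10x estimate swing counts as "vastly different"

def drastically_different(old, new, ratio=REOPT_RATIO):
    old, new = max(old, 1), max(new, 1)          # avoid division by zero
    return (new / old) >= ratio or (old / new) >= ratio

def plan_still_valid(saved_estimates, hv, probe_index_estimate, column_cardinality):
    """Compare current estimates for :HV against those saved when the query was optimized."""
    current_rows = probe_index_estimate("MyTable", "Column1", hv)
    if drastically_different(saved_estimates["Column1_rows"], current_rows):
        return False     # selectivity of Column1 = :HV shifted; reoptimize instead of reusing
    current_card = column_cardinality("MyTable", "Column2")
    if drastically_different(saved_estimates["Column2_card"], current_card):
        return False     # cardinality of the grouping column changed drastically
    return True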

     This verification of an access plan, however, can still incur substantial overhead, since many different objects (indexes, column statistics, constraints, internal access plan objects, etc.) must be examined and probed to determine whether the AP is safe to reuse. This time is attributed to a query's "full open" time. The problem is that this rechecking overhead on every AP reuse is often unnecessary, since the data and data distributions may be relatively stable much of the time.
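
     Purely as an illustration of the skip-the-recheck idea, the sketch below gates the full validation on a simple stability test (a changed-row counter and an age limit); these particular tests are assumptions for illustration and are not the methods of this disclosure.

import time

def environment_is_stable(rows_changed_since_validation, last_validated_at,
                          max_changed_rows=1000, max_age_seconds=300):
    """Assumed stability test: few changed rows and a recent successful validation."""
    age = time.time() - last_validated_at
    return rows_changed_since_validation <= max_changed_rows and age <= max_age_seconds

def validate_lightly(plan, host_vars, rows_changed, last_validated_at, full_validation):
    if environment_is_stable(rows_changed, last_validated_at):
        return True                               # skip the expensive probes entirely
    return full_validation(plan, host_vars)       # fall back to the full recheck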

     The query optimizer verifies whether an AP can be safely reused by asking the statistics engine various questions about the query. The optimizer may ask getFilterFactor/analyzeFilterFactor or getCardinality/analyzeCardinality questions over various query predicates. The difference between the get and analyze methods is that the get methods may return a previously cached value, whereas the analyze methods drill down and probe a statistics source for a current value. When attempting to verify that an AP is safely reusable,...
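
     The get/analyze split can be pictured with the sketch below, which reuses the getCardinality/analyzeCardinality names from the text; the caching details are assumptions for illustration only.

class StatisticsEngine:
    """get* may answer from a cached value; analyze* always probes a statistics source."""
    def __init__(self, probe_source):
        self._probe_source = probe_source   # callable that actually probes an index or column statistic
        self._cache = {}

    def analyzeCardinality(self, table, column):
        """Drill down and probe the statistics source for a current value."""
        value = self._probe_source(table, column)
        self._cache[(table, column)] = value
        return value

    def getCardinality(self, table, column):
        """Return a previously cached value when available; otherwise fall back to a probe."""
        key = (table, column)
        if key in self._cache:
            return self._cache[key]
        return self.analyzeCardinality(table, column)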