Browse Prior Art Database

DB2 Cost Formula

IP.com Disclosure Number: IPCOM000108366D
Original Publication Date: 1992-May-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 6 page(s) / 209K

Publishing Venue

IBM

Related People

Shibamiya, A: AUTHOR [+2]

Abstract

This article describes the cost formula derived from the numerous DB2* measurements to identify major performance-sensitive variables. This cost formula is used by the DB2 V2 optimizer in selecting the most efficient access path for a given SQL call and corresponding data.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 26% of the total text.

DB2 Cost Formula

       This article describes the cost formula derived from the
numerous DB2* measurements to identify major performance-sensitive
variables.  This cost formula is used by the DB2 V2 optimizer in
selecting the most efficient access path for a given SQL call and
corresponding data.

      Prior to Version 2 of DB2, the CPU cost formula in the DB2
optimizer was based simply on the expected number of rows qualifying
for the search arguments provided.  With this method, how many rows
are searched to get each qualifying row is not considered, leading to
some gross errors.  It also ignores performance sensitivities to the
number of columns fetched, sorted, or updated, the number of host
variables used, the number of search arguments which must be
evaluated, etc.

      One of the biggest challenges in dealing with the Relational
Data Base Management Systems (RDBMS) is performance.  RDBMS offers a
tremendous improvement in programming user productivity.  However,
unlike the traditional data base management systems, in which a
method to access data is predetermined at the time of data base
creation, RDBMS, such as DB2, dynamically attempt to select the most
efficient access path based on various statistics available at the
time of execution.  Such statistics can include information on tables
and indexes, such as the number of records and number of pages in a
table and the number of distinct index key values, the buffers
available, and the format of the SQL calls.  A relatively simple SQL
call joining 3 tables, each of which has 3 indexes, can have
thousands of different access paths possible.  (An access path
defines how the records in a table are to be retrieved; i.e., which
tables are accessed in what sequence, what indexes are to be used for
each table, etc.).

      DB2 optimizer relies on the cost formula, discussed in this
article, to evaluate each of the promising access path candidates to
select the one considered to be the most efficient in terms of both
CPU and I/O resource usage.

      In addition to providing the significantly improved cost
formula for the DB2 optimizer, a superset of this cost formula is
included in ANDB2, an analytical model of DB2, which is a Hone-Aid
tool used by IBM SEs to estimate the cost of various SQL calls in
terms of CPU and I/O resource usage to assist in capacity planning.
The cost formula in the DB2 optimizer is intended to help in
selecting the most efficient access path.  Therefore, that portion of
the cost independent of a particular access path chosen, e.g., the
cost of fetching rows back to DB2 COST FORMULA  -  Continued
application program, is not included in the DB2 optimizer, but is in
cluded in ANDB2 - hence a "superset".  Based on the invention, ANDB2
for DB2 V2R2 has been made available to the field.
Notations
NDMS= number of rows scanned by Data Manager, the lower-level (first)
component to process records
NDMC= number of rows participating in c...