Browse Prior Art Database

Integrated Buffer Management And Query Optimization Strategy For Relational Databases

IP.com Disclosure Number: IPCOM000100644D
Original Publication Date: 1990-May-01
Included in the Prior Art Database: 2005-Mar-15
Document File: 5 page(s) / 172K

Publishing Venue

IBM

Related People

Cornell, D: AUTHOR [+2]

Abstract

An integrated buffer management and query optimization strategy is described so that the access plans for all query types are considered together based on buffer availability. An integer programming formulation is used to select the best access plan and associated buffer allocation. The objective function is similar to the cost function used in the conventional query optimizer but is the sum over all query types weighted by query frequencies. The constraints are that not only the buffer allocation for each query, but also the time average buffer requirement over all queries, must be less than the total buffer size. Note that the time average buffer requirement for each query type is proportional to its execution time and arrival frequencies in addition to the buffer allocation for each instance of its execution.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 34% of the total text.

Integrated Buffer Management And Query Optimization Strategy For Relational Databases

       An integrated buffer management and query optimization
strategy is described so that the access plans for all query types
are considered together based on buffer availability. An integer
programming formulation is used to select the best access plan and
associated buffer allocation.  The objective function is similar to
the cost function used in the conventional query optimizer but is the
sum over all query types weighted by query frequencies.  The
constraints are that not only the buffer allocation for each query,
but also the time average buffer requirement over all queries, must
be less than the total buffer size.  Note that the time average
buffer requirement for each query type is proportional to its
execution time and arrival frequencies in addition to the buffer
allocation for each instance of its execution.  The execution time
certainly depends upon access plan selection and buffer allocation.
To get around this interdependency problem, we decompose the problem
into two parts and take an iterative approach.  The first part is the
optimization just described with an assumed response time for each
query type, and the second part is a queueing problem to solve for
the response time based on the access plan selections and buffer
allocation from the optimization problem.  The optimization problem
then uses the response time from the queueing model to solve for an
improved solution.  If we choose a large response time for each query
type to start the optimization problem, a very  conservative buffer
allocation will be made, as exaggerated response time leads to
overestimates on  the time average buffer requirement.  The queueing
analysis then provides an improved response time based on the
strategy recommended from the optimization problem.  With the
improved response time, more buffer space will be allocated and lead
to further improvement in the response time.  It appears to converge
after a few iterations.

      Consider a set of join queries Qi, i=1,...,NQ, and relations
Rk, k=1, ...,NR.  For each query type Qi, there is a set of different
join plans, QSij, j= 1,...,ni.  Here, the integrated strategy for
each join plan specifically indicates whether a joining relation is
to be kept in the buffer.  For example, in a two-way join of Rk and
Rj, nested loop join with Rk (or Rj) in buffer as inner relation and
nested loop join with Rk (or Rj) as inner relation but not in buffer
are four potential join plans.  Let Xij be 1 if plan QSij is adopted
for Qi and 0 otherwise.  That is to say each join plan is identified
by a (0,1) integer programming variable to indicate whether it is
adopted or not.  Let Pij be the number of pages to be read from disk
if plan QSij is adopted, Sijm be the number of times Rm is scanned
under QSij and Aijm be the number of disk accesses per scan of Rm
under QSij. Define gi to be the arrival freque...