Detection of a Stable Hot Data Set

IP.com Disclosure Number: IPCOM000116792D
Original Publication Date: 1995-Nov-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 4 page(s) / 157K

Publishing Venue

IBM

Related People

Dan, A: AUTHOR [+3]

Abstract

Disclosed is a method to identify on-line all or a part of the hot data set from an access stream with the additional objective of minimizing the number of changes in the decision.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 37% of the total text.

Detection of a Stable Hot Data Set

      Disclosed is a method to identify on-line all or a part of the
hot data set from an access stream with the additional objective of
minimizing the number of changes in the decision.

      In many computer system environments, information may be
organized differently at different levels of the storage hierarchy.
For example, in a database environment, the basic unit of information
at the disk level is a page and at the memory buffer level is a
record.  Similarly, in a RAID-5 environment, the parity blocks stored
at the disk may be different from the special blocks created at the
disk controller buffer level for the purpose of efficient buffering.
The higher level of the memory hierarchy is generally smaller in size
and retains only a part of the data set (the higher level is referred
to as a buffer).  Also, the overhead of accessing information at the
higher level is lower than that at a lower level of the memory
hierarchy.  Therefore, detecting and retaining the hot data set is
beneficial in reducing the access cost.  Note that if a hot granule
is replaced by
another equally hot granule in the higher level of memory hierarchy,
the buffer hit probability, and hence the expected cost of
information access, remains unchanged.  However, if the information
at the upper level has been updated and hence differs from that at
the lower level, the replaced data granule needs to be propagated to
the lower level.  Additionally, when the information organization is
different at different levels, the copy of the disk page may need to
be read first before the new copy is reconstructed.  Therefore,
replacing a hot data granule by an equally hot data granule may incur
substantial overhead.  Such overhead can be minimized if the hot set
retained in the buffer is stable.
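
      The cost argument above can be illustrated with a small sketch.
It is not part of the disclosure: the access probabilities, the cost
values, and the names hit_probability and replacement_cost are all
hypothetical and chosen only to show that swapping equally hot
granules leaves the hit probability unchanged while still paying a
propagation cost.

```python
# Hypothetical access probabilities; granules "a" and "b" are equally hot.
ACCESS_PROB = {"a": 0.3, "b": 0.3, "c": 0.2, "d": 0.2}


def hit_probability(buffer_contents):
    """Probability that the next access finds its granule in the buffer."""
    return sum(ACCESS_PROB[g] for g in buffer_contents)


def replacement_cost(dirty, write_cost, read_cost, needs_old_copy):
    """Cost of evicting one granule from the upper level (assumed model).

    A clean granule is simply dropped. A dirty granule must be propagated
    to the lower level, and if the two levels organize data differently,
    the old lower-level copy may have to be read first.
    """
    if not dirty:
        return 0.0
    return write_cost + (read_cost if needs_old_copy else 0.0)


before = hit_probability({"a", "c"})
after = hit_probability({"b", "c"})  # "a" replaced by the equally hot "b"

# Hypothetical costs in arbitrary units.
swap_overhead = replacement_cost(
    dirty=True, write_cost=10.0, read_cost=10.0, needs_old_copy=True
)

print(before == after)   # True: the hit probability did not change
print(swap_overhead)     # 20.0: yet the swap itself was not free
```

      Under these assumed numbers the swap buys nothing in hit
probability but costs one read and one write at the lower level,
which is the overhead a stable hot set avoids.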

      The disclosed method uses additional data structures such as an
address stack (detailed later) to gather on-line stable access
statistics, and uses this information judiciously to detect a stable
hot set.  The use of a secondary LRU address stack (i.e., an
additional buffer containing only the addresses of the data granules,
maintained using an LRU replacement policy) was proposed in [*] to
detect the hot set for the purpose of increasing buffer hit
probability.  However, the work did not address the issue of
minimizing the number of transitions in the buffer.  Without
additional mechanisms, such transitions will be frequent if the
buffer size is smaller than the hot set size.  The proposed algorithm
will minimize such transitions.
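
      A secondary LRU address stack of the kind described above can
be sketched as follows.  This is a minimal illustration and not the
disclosed algorithm: the class name LRUAddressStack, the capacity
parameter, and the per-address access counter are assumptions made
for the example.  The structure stores addresses only, never the
data granules themselves.

```python
from collections import OrderedDict


class LRUAddressStack:
    """Holds granule addresses only (no data), kept in LRU order."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        # address -> number of accesses seen while the address is resident
        # (the counter is an assumption, used here as a simple statistic)
        self._entries = OrderedDict()

    def access(self, address):
        """Record an access; return the evicted address, or None."""
        count = self._entries.pop(address, 0) + 1
        self._entries[address] = count  # re-insert at the MRU end
        if len(self._entries) > self.capacity:
            evicted, _ = self._entries.popitem(last=False)  # drop LRU end
            return evicted
        return None

    def count(self, address):
        """Accesses recorded for a resident address, 0 if not resident."""
        return self._entries.get(address, 0)

    def addresses(self):
        """Resident addresses from most to least recently used."""
        return list(reversed(self._entries))


stack = LRUAddressStack(capacity=3)
evictions = [stack.access(a) for a in (1, 2, 3, 1, 4)]
print(evictions)          # [None, None, None, None, 2]
print(stack.addresses())  # [4, 1, 3]
print(stack.count(1))     # 2
```

      Because only addresses are kept, such a stack can track more
granules than the data buffer can hold at a small memory cost.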

      The Figure shows the data structures used by the proposed
algorithm.  The LRU address stack is used to keep track of the (B+W)
most recently accessed data blocks on that disk, where B is the size
of the data buffer and W is the additional number of data blocks that
are "warm" and have the potential to become hot in the near future.
When a data block is accessed (whether found i...