Sub-Block Cache Paging in Shared Memory Multiprocessors

IP.com Disclosure Number: IPCOM000111684D
Original Publication Date: 1994-Mar-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 104K

Publishing Venue

IBM

Related People

Eberhard, RJ: AUTHOR (and 2 others)

Abstract

Cache block validity is maintained at a finer granularity within cache blocks, for sub-blocks. This has the advantage of creating variably sized cache blocks for different classes of memory operations (non-sequential or sequential memory accesses), while enabling different prefetching policies to be employed easily. It also reduces the penalty of refetching shared blocks invalidated by another processor's store access, and the store-fetch interlock delays associated with shared memory locations. In conventional multiprocessor systems, maintaining storage consistency among the cache buffers of the processors results in performance losses.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

Sub-Block Cache Paging in Shared Memory Multiprocessors

      Cache block validity is maintained at a finer granularity
within cache blocks, for sub-blocks.  This has the advantage of
creating variably sized cache blocks for different classes of memory
operations (non-sequential or sequential memory accesses), while
enabling different prefetching policies to be employed easily.  It
also reduces the penalty of refetching shared blocks invalidated by
another processor's store access, and the store-fetch interlock
delays associated with shared memory locations.  In conventional
multiprocessor systems, maintaining storage consistency among the
cache buffers of the processors results in performance losses.
Storage is generally organized as a hierarchy, with the larger main
storage at the base and one or more levels of cache buffer memories
above it, each consisting of a dynamically changing subset of the
data in main storage.  Cache buffers are organized as higher-speed,
but smaller, memories which reside closer to the instruction
execution units, enabling more rapid access to main storage data.
The caches are generally managed in units called cache blocks, with
capacities ranging from tens to thousands of cache blocks.  A cache
block is nothing more than a set of contiguous bytes of data in main
storage, managed as a logical unit.  In this model, it is assumed
that stores cannot be broadcast to all caches for parallel update,
an assumption that becomes increasingly valid as the degree of
multiprocessing increases.  The caching model therefore assumes that
cross-invalidation of cache copies for store accesses is required.
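
      As an illustration only (no such code appears in the
disclosure), the following C sketch models a directory entry that
keeps one presence bit per sub-block, assuming the 64-byte block and
16-byte sub-block geometry of the example below.  A remote store
then clears a single presence bit rather than invalidating the whole
block, and a sub-block miss need only fetch the missing 16 bytes.

/* Hypothetical sketch of per-sub-block validity; set-index bits and
 * replacement policy are omitted for brevity. */
#include <stdint.h>
#include <stdbool.h>

#define BLOCK_BYTES    64
#define SUBBLOCK_BYTES 16
#define SUBBLOCKS      (BLOCK_BYTES / SUBBLOCK_BYTES)  /* 4 */

struct dir_entry {
    uint32_t tag;      /* upper address bits identifying the block  */
    uint8_t  present;  /* one presence (validity) bit per sub-block */
};

/* Which of the four sub-blocks does this address fall in? */
static unsigned subblock_of(uint32_t addr)
{
    return (addr % BLOCK_BYTES) / SUBBLOCK_BYTES;      /* 0..3 */
}

/* Hit only if the block tag matches AND the sub-block is present. */
static bool subblock_hit(const struct dir_entry *e, uint32_t addr)
{
    uint32_t tag = addr / BLOCK_BYTES;
    return e->tag == tag && (e->present & (1u << subblock_of(addr)));
}

/* A store by another processor invalidates just the affected
 * sub-block instead of the entire 64-byte block. */
static void cross_invalidate(struct dir_entry *e, uint32_t addr)
{
    e->present &= ~(1u << subblock_of(addr));
}

Because only the affected presence bit is cleared, the remaining
sub-blocks of a shared block survive a cross-invalidate, which is
the source of the reduced refetch penalty described above.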

      This scheme differs, in the organization of the cache
directory and the caching control structures, from one which simply
reduces the cache block size to the size of the sub-block.
Typically, each cache must maintain a directory entry identifying
each cache block contained within itself.  These entries generally
consist of the upper portion of the address, less the address bits
that address bytes within the identified block.  As an example, in a
32-bit machine, the cache block size may be 64 bytes.  This requires
6 bits of address for byte identification, leaving 26 bits per
directory entry, or tag.  If the cache sub-block size is set at 16
bytes, an additional four presence bits are required to indicate the
availability of each of the four 16-byte sub-blocks, for a total of
30 bits per directory entry.  Simply reducing the block size to 16
bytes instead requires four 28-bit tags to cover the same 64 bytes,
or 4 x 28 = 112 bits of information within the cache directory,
nearly four times the cost of the sub-block scheme.  Cache blocks
are typically larger than the amount of data actually accessed by
any...
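
      For concreteness, the directory-cost arithmetic of the 32-bit
example above can be reproduced by the following small C program
(again an illustration, not part of the disclosure):

#include <stdio.h>

int main(void)
{
    const int addr_bits = 32;
    const int subs      = 64 / 16;          /* four 16-byte sub-blocks */

    /* Sub-block scheme: one 26-bit tag for the 64-byte block
     * (64 = 2^6 byte-offset bits) plus four presence bits.      */
    int sub_cost   = (addr_bits - 6) + subs;     /* 26 + 4 = 30  */

    /* Simply shrinking the block to 16 bytes: four 28-bit tags
     * (16 = 2^4 byte-offset bits) to cover the same 64 bytes.   */
    int small_cost = subs * (addr_bits - 4);     /* 4 x 28 = 112 */

    printf("sub-block scheme  : %d directory bits per 64 bytes\n",
           sub_cost);
    printf("small-block scheme: %d directory bits per 64 bytes\n",
           small_cost);
    return 0;
}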