Efficient Cache Access Through Store Compression

IP.com Disclosure Number: IPCOM000105489D
Original Publication Date: 1993-Aug-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 6 page(s) / 163K

Publishing Venue

IBM

Related People

Beressi, D: AUTHOR [+2]

Abstract

An algorithm is disclosed that increases cache access utilization. The algorithm traps scattered store requests, en route to the cache array, in a store queue interposed between the processor and the cache. The data is organized in the store queue so as to increase the amount of data written to the cache array in each cache access cycle. Storage consistency is maintained while the store compression function is performed.

Efficient Cache Access Through Store Compression

      Typically, in a system architecture, a concept known as
checkpointing is used to facilitate error recovery.  At the end of a
checkpoint the machine is considered to be in an error-free state.
If an error occurs in any operation after a checkpoint is started,
all operations performed since the checkpoint was set are backed
out, and the operations can then be retried from a known "good"
state.  All store attempts between 'start checkpoint' and 'end
checkpoint' must be held outside of addressable memory by the
storage subsystem in a temporary store queue.  When the checkpoint
is released, the data held in the store queue is stored to
addressable memory.  A new checkpoint can be started before a
previous checkpoint is released; releasing a checkpoint always
applies to the oldest 'start checkpoint' issued.
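
      As a rough illustration only, the following C sketch models
the checkpoint discipline just described.  Every name and structure
in it is an assumption made for this sketch, not part of the
disclosed design: stores issued after a 'start checkpoint' are held
in a temporary queue outside addressable memory, releasing a
checkpoint commits the oldest checkpoint's stores to memory, and an
error backs out every held store so execution can retry from the
last known "good" state.

    #include <stdint.h>

    #define MAX_PENDING 64                /* assumed queue capacity */

    typedef struct {
        uint64_t addr;                    /* target memory address  */
        uint8_t  data;                    /* byte being stored      */
        int      ckpt_id;                 /* owning checkpoint      */
    } PendingStore;

    static PendingStore queue[MAX_PENDING];
    static int n_pending   = 0;
    static int oldest_ckpt = 0;   /* oldest unreleased checkpoint */
    static int newest_ckpt = 0;   /* checkpoint accepting stores  */

    void start_checkpoint(void) { newest_ckpt++; }

    /* Stores between 'start' and 'end' checkpoint are held outside
       addressable memory (bounds checks omitted for brevity). */
    void store(uint64_t addr, uint8_t data)
    {
        queue[n_pending++] = (PendingStore){ addr, data, newest_ckpt };
    }

    /* Release applies to the oldest 'start checkpoint' issued: its
       held stores are committed to addressable memory. */
    void release_checkpoint(uint8_t *memory)
    {
        int kept = 0;
        for (int i = 0; i < n_pending; i++) {
            if (queue[i].ckpt_id == oldest_ckpt)
                memory[queue[i].addr] = queue[i].data;  /* commit */
            else
                queue[kept++] = queue[i];               /* still held */
        }
        n_pending = kept;
        oldest_ckpt++;
    }

    /* On an error, all operations since the oldest unreleased
       checkpoint are backed out by discarding the held stores;
       addressable memory never saw them. */
    void back_out(void)
    {
        n_pending   = 0;
        newest_ckpt = oldest_ckpt;
    }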

      Historically, one instruction was processed per checkpoint, so
only a small amount of data was checkpointed at any one time.  The
method of writing data to the cache once the checkpoint was released
was not critical to performance: one checkpoint's worth of data
could usually be stored to the cache array in a single cache access
cycle.  Newer machines, however, are beginning to process multiple
instructions between checkpoints, so a significant amount of data
can accumulate in the temporary store queue awaiting release of the
checkpoint.  When a checkpoint is released, multiple store requests
may be required to store all the data to the cache, and how the data
is written to the cache array can impact performance.

      Cache access contention is also affected by the reduction in
cycles per instruction (CPI) in newer systems; in some RISC machines
the CPI is less than one.  The reduction in CPI increases the demand
on the storage subsystem, since a greater number of memory accesses
is performed per cycle.  The store compression algorithm discussed
here attempts to reduce the number of store accesses to the cache
and, consequently, contention for cache access.

      The compression algorithm uses the store queue configuration
shown in Fig. 1 as the temporary store for checkpointed data.  The
width of the line buffers in the data store queue is equal to the
width of the cache array line, Y.  This allows a full line to be
written in each cache access cycle.  The processor data bus is
typically narrower than the line buffer by a factor of 2**k, k being
the difference in bits between the processor address bus and the
cache...
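
      As a rough sketch of the compression step described above
(under assumed sizes, since Fig. 1 is not reproduced here): each
line buffer spans a full cache line of Y bytes, and narrow processor
stores to the same line are merged into one buffer, with a per-byte
valid mask recording which bytes were written.  Draining one buffer
then writes a full line in a single cache access instead of up to
2**k separate bus-width stores.  The names LINE_BYTES, BUS_BYTES,
and QUEUE_DEPTH are placeholders for Y, the processor bus width, and
the queue depth.

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define LINE_BYTES 64    /* assumed cache line width, Y      */
    #define BUS_BYTES   8    /* assumed processor data bus width */
    #define QUEUE_DEPTH 4    /* assumed number of line buffers   */

    typedef struct {
        bool     valid;                  /* holds pending data   */
        uint64_t line_addr;              /* line-aligned address */
        uint8_t  data[LINE_BYTES];       /* one full cache line  */
        uint8_t  byte_valid[LINE_BYTES]; /* per-byte write mask  */
    } LineBuffer;

    static LineBuffer store_queue[QUEUE_DEPTH];

    /* Trap one bus-width store into the store queue, merging it
       with any pending stores to the same cache line.  Assumes
       addr is BUS_BYTES-aligned, so the store falls within one
       line.  Returns false when no buffer is free (the queue must
       drain to the cache first). */
    bool enqueue_store(uint64_t addr, const uint8_t *src)
    {
        uint64_t line = addr & ~(uint64_t)(LINE_BYTES - 1);
        unsigned off  = (unsigned)(addr & (LINE_BYTES - 1));
        LineBuffer *free_buf = NULL;

        for (int i = 0; i < QUEUE_DEPTH; i++) {
            if (store_queue[i].valid &&
                store_queue[i].line_addr == line) {
                /* Compression: merge into this line's buffer. */
                memcpy(&store_queue[i].data[off], src, BUS_BYTES);
                memset(&store_queue[i].byte_valid[off], 1, BUS_BYTES);
                return true;
            }
            if (!store_queue[i].valid && free_buf == NULL)
                free_buf = &store_queue[i];
        }
        if (free_buf == NULL)
            return false;

        free_buf->valid     = true;
        free_buf->line_addr = line;
        memset(free_buf->byte_valid, 0, LINE_BYTES);
        memcpy(&free_buf->data[off], src, BUS_BYTES);
        memset(&free_buf->byte_valid[off], 1, BUS_BYTES);
        return true;
    }

In a real array the byte_valid mask would drive per-byte write
enables so that untouched bytes of the line are left intact; that is
one way storage consistency can be kept while stores are compressed.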