Hybrid Checkpoint Retry Recovery Mechanism for Systems With Store-Through Caches

IP.com Disclosure Number: IPCOM000100931D
Original Publication Date: 1990-Jun-01
Included in the Prior Art Database: 2005-Mar-16
Document File: 2 page(s) / 80K

Publishing Venue

IBM

Related People

Price, D: AUTHOR [+2]

Abstract

Even though store-through cache designs show a performance improvement over store-in designs when there is enough "storage write" bandwidth, there is a penalty associated with the use of instruction retry in terms of algorithmic complexity. The use of checkpointing reduces this complexity, but the traditional implementation is not suitable for store-through systems.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      The disclosed technique (see figure) avoids these difficulties
by using buffers and adding an independent retry mechanism for the
transfer of data to memory.  Key to this method are the use of the
Translation Lookaside Buffer (TLB) for read access, the keeping of
storage keys in the Directory Lookaside Table (DLAT) rather than in
the cache directory, and the ability to save a checkpoint without
significant penalty.  A cheap checkpoint is needed because
checkpoints occur more often than in a classical checkpoint retry
scheme.

      Buffers are used to hold data generated by an instruction until
all checks have been completed.  At that point the buffer is
committed to storage and the checkpoint is moved to the instruction
following the instruction with the committed data.
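
      The buffering and checkpoint movement described above can be
sketched roughly as follows.  This is only an illustrative model, not
the disclosed hardware: the names StoreBuffer, Cpu, execute and
checks_completed, and the instruction addresses used, are assumptions
made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class StoreBuffer:
    """Store data produced by one instruction (hypothetical model)."""
    next_instr: int          # address of the instruction after the producer
    writes: dict             # storage address -> value
    committed: bool = False  # set once all checks have completed


@dataclass
class Cpu:
    checkpoint: int = 0                          # address to restart from
    buffers: list = field(default_factory=list)  # pending store buffers

    def execute(self, next_instr, writes):
        """Hold an instruction's store data in a buffer; nothing is committed yet."""
        buf = StoreBuffer(next_instr, dict(writes))
        self.buffers.append(buf)
        return buf

    def checks_completed(self, buf):
        """All checks passed: commit the buffer and move the checkpoint
        to the instruction following the one whose data was committed."""
        buf.committed = True
        self.checkpoint = buf.next_instr


cpu = Cpu(checkpoint=0x100)
buf = cpu.execute(next_instr=0x104, writes={0x2000: 7})
before = cpu.checkpoint      # still 0x100 while checks are outstanding
cpu.checks_completed(buf)
after = cpu.checkpoint       # now 0x104
```

      In this sketch the checkpoint advances once per committed
instruction, which is why saving it has to be inexpensive.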

      The following occurs on any error:  Any blocks in the cache
which contain uncommitted data are invalidated.  All uncommitted
buffers are purged.  All committed data is put away in storage.  The
processor is restored to the last checkpoint.  The processor is
restarted.  If there is an error in sending the data to storage, it
is recovered by repeating the retry process, which includes storing
all committed buffers as a step.  At worst, a unique error syndrome
is stored, so that the next read of that location from storage
produces an error pointing to the error source.  Service systems and
software recovery can then be invoked.
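
      The error sequence above can be sketched as a small software
model.  It is a minimal sketch under stated assumptions, not the
disclosed implementation: the Machine and Buffer classes, the POISON
marker standing in for the unique error syndrome, the MAX_RETRIES
bound and the fault-injection table "failing" are all hypothetical.

```python
from dataclasses import dataclass, field

POISON = object()   # stand-in for the unique error syndrome (assumption)
MAX_RETRIES = 3     # assumed bound; the disclosure gives no number


@dataclass
class Buffer:
    writes: dict             # storage address -> value
    committed: bool = False


@dataclass
class Machine:
    storage: dict = field(default_factory=dict)
    cache: dict = field(default_factory=dict)    # addr -> (value, has_uncommitted_data)
    buffers: list = field(default_factory=list)
    checkpoint: int = 0      # instruction address saved at the last commit
    pc: int = 0              # current instruction address
    failing: dict = field(default_factory=dict)  # addr -> injected write failures left

    def write_storage(self, addr, value):
        """Return True on success; fail while an injected fault remains."""
        if self.failing.get(addr, 0) > 0:
            self.failing[addr] -= 1
            return False
        self.storage[addr] = value
        return True

    def recover(self):
        # Invalidate cache blocks that contain uncommitted data.
        self.cache = {a: e for a, e in self.cache.items() if not e[1]}
        # Purge all uncommitted buffers.
        self.buffers = [b for b in self.buffers if b.committed]
        # Put committed data away in storage; repeat the pass on failure.
        failed = []
        for _ in range(MAX_RETRIES):
            failed = [a for b in self.buffers
                      for a, v in b.writes.items()
                      if not self.write_storage(a, v)]
            if not failed:
                break
        # Worst case: mark the location so the next read reports the error.
        for addr in failed:
            self.storage[addr] = POISON
        self.buffers.clear()
        # Restore the processor to the last checkpoint and restart there.
        self.pc = self.checkpoint


m = Machine(checkpoint=100, pc=108,
            cache={0x10: (1, False), 0x20: (2, True)},
            buffers=[Buffer({0x10: 1}, committed=True),
                     Buffer({0x20: 2}),
                     Buffer({0x30: 3}, committed=True)],
            failing={0x30: 1})   # one transient failure on address 0x30
m.recover()
```

      Here the transient failure on one address is absorbed by the
second pass; a failure that persists through every pass leaves the
marker in that location instead of the data.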

      The keys are ...