Double Word Change Recording

IP.com Disclosure Number: IPCOM000107994D
Original Publication Date: 1992-Apr-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 2 page(s) / 121K

Publishing Venue

IBM

Related People

Emma, PG: AUTHOR [+5]

Abstract

A method is set forth for reducing the rate of uncorrectable errors in a "store in" cache. The observed characteristics of soft errors in caches indicate that the probability of occurrence of these errors is independent of time. Among others, errors caused by alpha particles or cosmic rays would exhibit this characteristic. The following describes an improved method for recording changes to the data in the cache which reduces the vulnerability of the cache to unrecoverable errors.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

Double Word Change Recording

       A method is set forth for reducing the rate of
uncorrectable errors in a "store in" cache.  The observed
characteristics of soft errors in caches indicate that the
probability of occurrence of these errors is independent of time.
Among others, errors caused by alpha particles or cosmic rays would
exhibit this characteristic.  The following describes an improved
method for recording changes to the data in the cache which reduces
the vulnerability of the cache to unrecoverable errors.

      In a store-in cache with error detection (e.g., parity) but no
error correction, and with the usual replacement and change recording,
any error in a line that contains changed data is unrecoverable.  Since
the cache manages lines as units, changed lines are vulnerable to
error in two ways: errors that occur after the line has been used are
detected at castout time, and errors in any portion of the line that
is never used affect the entire line.  Studies indicate that (on
average) fewer than 6 of 16 double words in a 128-byte cache line are
modified in a changed line.  Recording the changed status of a line
on a double word rather than a whole line basis greatly reduces the
vulnerability of the cache to unrecoverable errors for two reasons.
First, any errors in double words that were never changed would not
cause unrecoverable errors since the correct data could be restored
from main memory.  Second, an error that occurred in a double word
before that double word was changed could not cause an unrecoverable
error, provided that all data transmission between the processor and
the cache uses complete double words, because the store overwrites
the faulty data.  With the usual store-in replacement policy,
the second effect is small, and the benefit of double-word change
recording would be to reduce the average vulnerability of data in the
cache to unrecoverable errors by a factor of approximately 3.
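
      The comparison above can be illustrated with a minimal sketch.
It is not the disclosed hardware design: the CacheLine class, its
method names, and the assumption that a detected error is equally
likely in any double word are hypothetical, and 6 changed double words
is taken as the upper bound of the "fewer than 6 of 16" figure.

```python
from dataclasses import dataclass, field

DW_PER_LINE = 16  # 128-byte line divided into 8-byte double words


@dataclass
class CacheLine:
    """Hypothetical model of one cache line with per-double-word change bits."""
    data: list = field(default_factory=lambda: [0] * DW_PER_LINE)
    changed: list = field(default_factory=lambda: [False] * DW_PER_LINE)

    def store(self, dw_index, value):
        # A store writes a complete double word, so it also overwrites
        # any earlier error in that double word.
        self.data[dw_index] = value
        self.changed[dw_index] = True

    def recoverable_per_line(self, bad_dw):
        # Conventional scheme: a single change bit covers the whole line,
        # so the position of the failing double word is irrelevant.
        return not any(self.changed)

    def recoverable_per_dw(self, bad_dw):
        # Double-word scheme: only the failing double word's bit matters;
        # an unchanged double word can be refetched from main memory.
        return not self.changed[bad_dw]


def vulnerable_fraction(line, per_dw):
    """Fraction of double words in which a detected error is unrecoverable."""
    check = line.recoverable_per_dw if per_dw else line.recoverable_per_line
    bad = sum(1 for i in range(DW_PER_LINE) if not check(i))
    return bad / DW_PER_LINE


line = CacheLine()
for i in range(6):  # assumed: 6 of the 16 double words are modified
    line.store(i, 0xABCD)

per_line = vulnerable_fraction(line, per_dw=False)
per_dw = vulnerable_fraction(line, per_dw=True)
reduction = per_line / per_dw
```

      Under these assumptions the whole line is vulnerable with
line-level recording, 6/16 of it with double-word recording, and the
ratio 16/6 (about 2.7) is consistent with the factor of approximately
3 quoted above.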

      However, combined with a modified store-in replacement
strategy, for example, early castout (ECO) (*), cache lines that are
resident in the cache for long periods of time and are changed many
times would be less likely to suffer unrecoverable errors.  With ECO,
changed lines are written back to main memory whenever they leave
most recently used (MRU) replacement status.  With double word change
recording and ECO, the only parts of the line that are vulnerable are
those that are changed while the line is MRU, and the vulnerability is
limited to the time the line is MRU.  The net effect of ECO combined
with double-word change recording would be a reduction by a factor of
more than 10 in the average susceptibility of changed data in the cache,
with a smaller reduction (by a factor of about 5) in the
vulnerability of heavily used system data.
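
      The interaction of ECO with double-word change recording can be
sketched as follows.  This is a simplified model under stated
assumptions, not the disclosed implementation: the CongruenceClass and
Line classes are hypothetical, only cache hits are modeled, and the
write-back to main memory is represented by appending to a list.

```python
DW_PER_LINE = 16  # double words per 128-byte line


class Line:
    def __init__(self, tag):
        self.tag = tag
        self.changed = [False] * DW_PER_LINE  # one change bit per double word


class CongruenceClass:
    """Hypothetical 4-way congruence class; lines[0] is the MRU line."""

    def __init__(self, tags):
        self.lines = [Line(t) for t in tags]
        self.writebacks = []  # (tag, double-word index) pairs sent to memory

    def _find(self, tag):
        return next(l for l in self.lines if l.tag == tag)

    def _touch(self, tag):
        line = self._find(tag)
        old_mru = self.lines[0]
        if line is not old_mru:
            # Early castout: the old MRU line is leaving MRU status, so
            # its changed double words are written back and the change
            # bits are cleared.
            for i, bit in enumerate(old_mru.changed):
                if bit:
                    self.writebacks.append((old_mru.tag, i))
            old_mru.changed = [False] * DW_PER_LINE
            self.lines.remove(line)
            self.lines.insert(0, line)
        return line

    def store(self, tag, dw_index):
        self._touch(tag).changed[dw_index] = True

    def unrecoverable(self, tag, dw_index):
        # A detected error is unrecoverable only if the failing double
        # word holds changed data not yet written back.
        return self._find(tag).changed[dw_index]


cc = CongruenceClass(["A", "B", "C", "D"])
cc.store("A", 3)
vulnerable_while_mru = cc.unrecoverable("A", 3)
cc.store("B", 0)  # line A leaves MRU status and is cast out early
vulnerable_after_eco = cc.unrecoverable("A", 3)
```

      In this model a changed double word is exposed only between the
store and the moment its line stops being MRU, which is the window of
vulnerability described above.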

      The cost of recording changes on a double-word granularity is
high.  For example, in the 3081 KX cache, each congruence class (with
four cache lines) requires 4 X 15 bits for addres...