Technique to Tolerate Hard Failure of a Cache Set

IP.com Disclosure Number: IPCOM000106064D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 2 page(s) / 68K

Publishing Venue

IBM

Related People

Bowen, NS: AUTHOR [+2]

Abstract

A technique is disclosed that allows continued operation of a computing system in the presence of a hard failure of a cache set. Also disclosed is a technique to fully utilize central storage in the presence of such a hardware failure.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Technique to Tolerate Hard Failure of a Cache Set

      A technique is disclosed that allows continued operation of a
computing system in the presence of a hard failure of a cache set.
Also disclosed is a technique to fully utilize central storage in
the presence of such a hardware failure.

      A permanent fault in a processor cache set can render a
computer inoperable.  The disclosed technique utilizes the fact that
the bits of a page frame address overlap with bits in the cache set
number.  The overlap bits are defined as those bits that are common
to both a cache set and real page number.  The number of these bits
is defined by:

  o = \begin{cases} s + i - k & \text{if } s + i > k \\ 0 & \text{otherwise} \end{cases}

where 2^i is the number of bytes in a cache line, 2^s is the number
of sets in the cache, and 2^k is the number of bytes in a page.  For
the operation of this mechanism, it is necessary to identify any page
whose real page number has low order o bits equal to the high order o
bits of the number of the cache set with the hardware error.  Suppose
cache set Q has a permanent fault.  Then the value of the overlap
bits, Q^*, is:

  Q^* = \left\lfloor \frac{Q}{2^{s-o}} \right\rfloor

It is then required to identify all the pages that map into cache
set Q.  This is the set of pages P^* = { P | P mod 2^o = Q^* }.
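
The definitions above can be sketched in a few lines of Python.  This
is only an illustration, not part of the original disclosure: the
function names are hypothetical, and the example parameters (128-byte
lines, 512 sets, 4096-byte pages, faulty set 358) are assumed values
chosen to make the arithmetic easy to follow.

```python
def overlap_bits(s: int, i: int, k: int) -> int:
    """Number of bits shared by the cache set number and the real page number."""
    return s + i - k if s + i > k else 0


def overlap_value(q: int, s: int, o: int) -> int:
    """Q^*: the high order o bits of the s-bit cache set number q."""
    return q >> (s - o)  # same as floor(q / 2**(s - o))


def page_in_p_star(p: int, q_star: int, o: int) -> bool:
    """True if real page number p belongs to P^*, i.e. p mod 2**o == Q^*."""
    # Note: when o == 0 every page satisfies this test, because every
    # page then covers every cache set.
    return p % (1 << o) == q_star


# Assumed example parameters (not taken from the disclosure).
s, i, k = 9, 7, 12          # 512 sets, 128-byte lines, 4096-byte pages
faulty_set = 358            # Q

o = overlap_bits(s, i, k)                   # 9 + 7 - 12 = 4
q_star = overlap_value(faulty_set, s, o)    # 358 >> 5 = 11
```

With these assumed values, page 27 is in P^* (27 mod 16 = 11) while
page 28 is not (28 mod 16 = 12).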

      For continued operation of the computing system, the disclosed
technique ensures that the cache set with the hardware error is not
used.  This is accomplished by controlling the set of real storage
frames that are accessed.  Specifically, the set of frames P^* is
not used.  There are many frames that map to cache set Q.  The real
storage frames that may not be used are all the frames whose low
order o bits equal the high order o bits of the inoperable cache set
Q.  The fraction of pages lost is 2^(-o), which is equal to
2^(k - (s + i)).
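
As a rough check of this reasoning, the following Python sketch
filters a range of real frame numbers and confirms by brute force
that no remaining frame touches the faulty set.  It is an
illustration under stated assumptions, not the disclosed
implementation: the function names are hypothetical, the parameter
values are assumed, and the cache is assumed to index sets with bits
i through i+s-1 of the real address.

```python
def usable_frames(num_frames: int, q: int, s: int, i: int, k: int) -> list[int]:
    """Frame numbers outside P^* for faulty cache set q."""
    o = max(s + i - k, 0)            # number of overlap bits
    q_star = q >> (s - o)            # high order o bits of q
    mask = (1 << o) - 1              # selects the low order o bits of a frame number
    # With o == 0 the mask is 0 and no frame is usable: every page
    # then covers every cache set.
    return [p for p in range(num_frames) if (p & mask) != q_star]


def frame_touches_set(p: int, q: int, s: int, i: int, k: int) -> bool:
    """Brute-force check: does any cache line in frame p map to set q?

    Assumes the set index is bits i..i+s-1 of the real address.
    """
    base = p << k
    set_mask = (1 << s) - 1
    return any(((addr >> i) & set_mask) == q
               for addr in range(base, base + (1 << k), 1 << i))


# Assumed example: 512 sets, 128-byte lines, 4096-byte pages, faulty set 358.
frames = usable_frames(1024, 358, s=9, i=7, k=12)
lost_fraction = 1 - len(frames) / 1024   # 64 of 1024 frames are excluded
```

In this assumed configuration o = 4, so 1/16 of the frames are
withheld, matching the 2^(-o) fraction stated above.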

      All cache sets that have the same high order o bits as the
faulty cache set are also not used.  The fraction of sets lost is
2^(s-o) / 2^s, which is equal to 2^(k - (s + i)).  Note that the
same fraction of main memory and cache memory is lost.  The
perc...