Residence Group Recording for Multiprocessor Caches

IP.com Disclosure Number: IPCOM000105419D
Original Publication Date: 1993-Jul-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 91K

Publishing Venue

IBM

Related People

Liu, L: AUTHOR

Abstract

A technique is disclosed for recording cache residence information by cache groups, with each group covering more than one cache. This reduces the number of broadcast invalidates while requiring fewer bits in residence tags.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Residence Group Recording for Multiprocessor Caches

      A technique is disclosed for recording cache residence
information by cache groups, with each group covering more than one
cache.  This reduces the number of broadcast invalidates while
requiring fewer bits in residence tags.

      To manage multiprocessor cache coherence efficiently, it is
often desirable for the central controller(s) to know which caches
hold each line.  One benefit is a reduction in the number of
XI-invalidate signals sent when a line shared by caches is modified.
A typical approach is to add a bit-vector tag (with as many bits as
there are caches) to each memory line, or to each entry of a
second-level cache directory or a central directory.  In some
implementations the number of bits required for such residence tags
may be infeasible.

      In this invention I propose a method for reducing the number of
bits in such tags while still achieving most of the benefit of
reducing XI-invalidates.
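
To make the storage trade-off concrete, the following is a minimal sketch of the tag-size arithmetic, not part of the original disclosure.  The function names and the directory size used in the usage note are hypothetical; the only facts taken from the text are that a conventional tag needs one bit per cache and that a grouped tag needs one bit per group.

```python
def residence_tag_bits(num_caches: int, group_size: int = 1) -> int:
    """Bits in one residence tag when each bit covers `group_size` caches.

    group_size == 1 models the conventional one-bit-per-cache vector.
    Hypothetical helper, for illustration only.
    """
    if num_caches < 1 or group_size < 1:
        raise ValueError("num_caches and group_size must be positive")
    # Ceiling division: a final, smaller group still needs its own bit.
    return (num_caches + group_size - 1) // group_size


def total_tag_bits(num_entries: int, num_caches: int, group_size: int = 1) -> int:
    """Total residence-tag storage for a directory with `num_entries` entries."""
    if num_entries < 0:
        raise ValueError("num_entries must be non-negative")
    return num_entries * residence_tag_bits(num_caches, group_size)
```

For the 16-cache design described later in this disclosure, `residence_tag_bits(16)` gives 16 bits per tag and `residence_tag_bits(16, 4)` gives 4.  Over an assumed directory of 1024 entries (a figure chosen here purely for illustration), that is 16384 bits versus 4096 bits.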

      The motivation is that a data store tends to occur to a line
with very few (e.g., 1-2) copies among the caches.  As a result, the
bits required for residence tags may be reduced at the cost of only a
modest number of redundant XI-invalidates.  The basic ideas will be
illustrated using an example design.

      Consider a multiprocessor system with 16 processors P[0]-P[15],
each having its own private cache.  The 16 caches are divided into 4
groups G[0]-G[3], with G[i] covering the caches of P[4i], P[4i+1],
P[4i+2] and P[4i+3].  A central controller C is responsible for
managing cache coherence.  There is a central directory D at C that
records the line residence information of all caches.  D is a typical
set-associative directory, with each entry recording the line address
identifier and other tags (e.g., validity bit).  Each entry of D has
a RES-tag for residence information.  Each RES-tag is a bit-vector of
length 4, with the i-th bit ON (0<=i<=3) indicating the possibility
that the line resides in cache(s) of G[i].  Initially D is empty when
caches are all empty.  In the illustration we assume that all caches
are maintained as subsets of the lines in D, although this is not a
necessity.  The major operations are as follows:

   1.  When a cache in G[i] fetches a new line (e.g., from main
       memory) the i-th bit of the RES-tag in D is turned ON.  (Upon
       a miss in D a new entry for the line is created in D, with
       its RES-tag zeroed.)

   2.  When XI-invalidates are needed (e.g., when a processor gets EX
       status on a line, or when I/O ...
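
The group mapping, the RES-tag, and operation 1 above can be sketched as follows.  This is an illustrative model under stated assumptions, not the disclosed hardware: the class and method names are hypothetical, D is modeled as a plain dictionary rather than a set-associative directory, and other tags such as the validity bit are omitted.  The handling of XI-invalidates in operation 2 is not modeled because the available text is truncated at that point.

```python
from dataclasses import dataclass, field
from typing import Dict, List

NUM_PROCESSORS = 16                          # P[0]-P[15] in the example design
GROUP_SIZE = 4                               # G[i] covers P[4i]..P[4i+3]
NUM_GROUPS = NUM_PROCESSORS // GROUP_SIZE    # 4 groups, so a 4-bit RES-tag


def group_of(processor: int) -> int:
    """Return i such that the cache of P[processor] belongs to G[i]."""
    if not 0 <= processor < NUM_PROCESSORS:
        raise ValueError("processor index out of range")
    return processor // GROUP_SIZE


@dataclass
class CentralDirectory:
    """Hypothetical stand-in for D: maps a line address to its RES-tag.

    Simplification (assumption): a dict replaces the set-associative
    structure, so replacement and validity bits are not modeled.
    """
    res_tags: Dict[int, int] = field(default_factory=dict)

    def record_fetch(self, processor: int, line: int) -> None:
        """Operation 1: a cache in G[i] fetches `line`, so bit i goes ON.

        On a miss in D the entry is first created with a zeroed RES-tag.
        """
        tag = self.res_tags.setdefault(line, 0)
        self.res_tags[line] = tag | (1 << group_of(processor))

    def possible_groups(self, line: int) -> List[int]:
        """Groups whose RES-tag bit is ON, i.e. that may hold `line`."""
        tag = self.res_tags.get(line, 0)
        return [i for i in range(NUM_GROUPS) if tag & (1 << i)]
```

As a usage example, after `record_fetch(5, 0x100)`, `record_fetch(6, 0x100)` and `record_fetch(12, 0x100)` the RES-tag for line 0x100 is 0b1010: P[5] and P[6] both fall in G[1] and share one bit, while P[12] sets the bit for G[3].  The tag records only that some cache in a group may hold the line, not which one.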