Increasing L2 Bandwidth for Store Operations

IP.com Disclosure Number: IPCOM000105061D
Original Publication Date: 1993-Jun-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 102K

Publishing Venue

IBM

Related People

Ignatowski, M: AUTHOR [+3]

Abstract

For an L2 that supports multiple L1 caches which "store through", the directory impact of store traffic on the L2 is burdensome. A means of increasing the directory bandwidth and promoting concurrent store updates is disclosed.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      Assume that the memory hierarchy comprises L1-CACHES that
are managed WTWAX.  A WTWAX cache management protocol is defined as:

o   all stores are written through the L1 cache to the L2 (WT),
o   all lines that are stored into by the processors must be
    allocated (WA - WRITE ALLOCATE), and
o   all lines written into must be held exclusively (X).
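The three WTWAX rules can be sketched as a small cache model. This is an illustrative sketch only; the class and method names (L1CacheWTWAX, SimpleL2, fetch_exclusive) are assumptions, not structures from the disclosure.

```python
class SimpleL2:
    """Stand-in for the shared L2 (assumed interface, not from the source)."""
    def __init__(self):
        self.mem = {}

    def fetch_exclusive(self, line_addr):
        # Grant the line to the requester with exclusive status.
        return self.mem.get(line_addr)

    def store(self, line_addr, data):
        self.mem[line_addr] = data


class L1CacheWTWAX:
    """L1 cache that is Write-Through, Write-Allocate, eXclusive (WTWAX)."""
    def __init__(self, l2):
        self.l2 = l2
        self.lines = {}       # line address -> [data, state]; state "S" or "X"

    def store(self, line_addr, data):
        line = self.lines.get(line_addr)
        if line is None or line[1] != "X":
            # WA: a stored-into line must be allocated in the L1 ...
            # X:  ... and held exclusively, so obtain it exclusive from L2.
            self.lines[line_addr] = [self.l2.fetch_exclusive(line_addr), "X"]
        self.lines[line_addr][0] = data
        # WT: every store is also written through to the L2.
        self.l2.store(line_addr, data)
```

A store to a line not held exclusive first allocates it exclusive, then updates both levels, which is exactly why every store generates L2 traffic.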

      In such caches the DW (doubleword) store rate is .33
STORES/INSTRUCTION, and the aggregate store rate for 16 processors
attached to a single L2 can easily exceed 2 DW-STORES/CYCLE.
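A back-of-envelope check of that aggregate rate, assuming each processor averages roughly 0.4 instructions per cycle (a figure not given in the source):

```python
stores_per_instruction = 0.33      # DW store rate per processor (from the text)
processors = 16
ipc_per_processor = 0.4            # assumed, not stated in the source

aggregate_store_rate = processors * ipc_per_processor * stores_per_instruction
print(round(aggregate_store_rate, 2))   # roughly 2.11 DW-STORES/CYCLE
```

Under that assumption the shared L2 indeed sees on the order of 2 DW-stores per cycle.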

      The stores from each processor are maintained in STORE STACKS,
and misses from a processor are given priority so that the missing
processor can resume as early as possible.  Items in a STORE STACK
can relate to a miss only if the L2 indicates that the line was or is
held with exclusive status and its potential updates have not been
purged; in that case the relevant updates from the STORE STACK are
combined with the information within the L2, and the L2 is updated.
Such updates can reside in only one processor's STORE STACK.
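The merge step can be sketched as follows. The function name, the 128-byte line size, and the doubleword granularity are illustrative assumptions; the disclosure specifies none of them.

```python
def merge_store_stack(line_data, store_stack, line_addr, line_size=128):
    """Apply pending doubleword stores for line_addr to a fetched line.

    line_data   -- list of doublewords for the line held in the L2
    store_stack -- list of (address, value) pairs from one processor's
                   STORE STACK (only one stack can hold updates for an
                   exclusive line under WTWAX)
    """
    merged = list(line_data)
    for addr, value in store_stack:
        if line_addr <= addr < line_addr + line_size:
            offset = (addr - line_addr) // 8     # doubleword index
            merged[offset] = value
    return merged
```

Pending updates that fall inside the missed line are folded into the L2's copy; stores to other lines are left on the stack.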

      The miss traffic to the L2 can be estimated by aggregating the
individual uniprocessor L1 miss rates, which depend on L1 cache size.
Thus, with an L1-MISS every 25 instructions, the miss rate for 16
processors attached to a single L2 could approximate .25
MISSES/CYCLE.  If STORES and MISSES are handled on an individual
basis, each makes a single L2 DIRECTORY access.  The trick is to
increase the bandwidth to the L2-DIRECTORY so that the STORE traffic
can be handled in parallel.
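A rough consistency check of the miss figure, again assuming each processor averages about 0.4 instructions per cycle (an assumption, not stated in the source):

```python
processors = 16
ipc_per_processor = 0.4            # assumed
misses_per_instruction = 1 / 25    # one L1-MISS every 25 instructions

miss_rate = processors * ipc_per_processor * misses_per_instruction
print(round(miss_rate, 2))         # roughly 0.26 MISSES/CYCLE
```

Stores thus outnumber misses at the L2 directory by roughly eight to one, which is why handling stores in parallel, rather than serializing them with misses, is the profitable target.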

      INCREASING L2 BANDWIDTH FOR STORE OPERATIONS - The manner of
increasing the L2-DIRECTORY and L2-ARRAY bandwidth is to use a DUAL
DIRECTORY for the L2.  In a cache system that provides dual
directories, we have:

o   L2-Access Directory

          The Access Directory is a direct mapped directory that
    converts the access from the processor into a position in the
    cache to allow parallel access.  The size of the L2-Access
    Directory in cache lines can be much larger than the size of
    the cache arrays in lines.

o   L2-Contents Directory

          The Contents Directory is a set associative directory that
    is used to make replacement decisions, and each entry points to a
    line that is actually present in the cache arrays.  The
    L2-Contents Directory contains the real address tags of the lines
    in...
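The Access Directory half of the scheme can be sketched as a direct-mapped table that turns a line address into a cache-array slot in a single probe. The structure below is an assumption for illustration; the disclosure's actual entry format is not given in this abbreviated text.

```python
class AccessDirectory:
    """Direct-mapped: one probe converts a line address into an array slot,
    so a store can proceed without a set-associative search."""

    def __init__(self, entries):
        # Can hold many more entries than the cache has lines,
        # as the text notes for the L2-Access Directory.
        self.entries = entries
        self.slot = [None] * entries     # index -> cache-array slot, or None

    def lookup(self, line_addr):
        return self.slot[line_addr % self.entries]

    def install(self, line_addr, array_slot):
        self.slot[line_addr % self.entries] = array_slot
```

Because each probe touches exactly one entry, several such probes (one per store stream) can in principle be serviced concurrently, which is the bandwidth gain the dual-directory arrangement is after.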