Using a Latin Square to Improve Move Operations

IP.com Disclosure Number: IPCOM000099331D
Original Publication Date: 1990-Jan-01
Included in the Prior Art Database: 2005-Mar-14
Document File: 2 page(s) / 73K

Publishing Venue

IBM

Related People

Emma, PG: AUTHOR [+5]

Abstract

In an ordinary cache design where the DLAT, CACHE DIRECTORY, and CACHE ARRAYS have to be accessed concurrently and the correct double word (DW) is selected after the comparisons have been made (i.e., late select), the bandwidth of the cache arrays is under-utilized by a factor of four or, more generally, by a factor equal to the set associativity. For each DW selected from the cache, four DWs are driven out of the cache arrays. These candidate DWs represent the same DW position of different lines. With a LATIN SQUARE LAYOUT of the data in the cache arrays, it is possible to perform late select and also to selectively harness the full bandwidth of the cache and drive out four consecutive DWs, i.e., an Octword (OW).

This text was extracted from an ASCII text file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

       In an ordinary cache design where the DLAT, CACHE
DIRECTORY, and CACHE ARRAYS have to be accessed concurrently and the
correct double word (DW) is selected after the comparisons have been
made (i.e., late select), the bandwidth of the cache arrays is
under-utilized by a factor of four or, more generally, by a factor
equal to the set associativity.  For each DW selected from the cache,
four DWs are driven out of the cache arrays.  These candidate DWs
represent the same DW position of different lines.  With a LATIN
SQUARE LAYOUT of the data in the cache arrays, it is possible to
perform late select and also to selectively harness the full
bandwidth of the cache and drive out four consecutive DWs, i.e., an
Octword (OW).

      The following schematic illustrates such a layout.  The cache
is implemented on four sets of arrays, ARR0, ARR1, ARR2, and ARR3,
that are accessed concurrently by sending an address to each array.
Designating the four members of a congruence class as A, B, C, and D,
the so-called LATIN SQUARE layout is shown below:
    LATIN SQUARE LAYOUT

      ARR0      ARR1      ARR2      ARR3

      A(0)      B(0)      C(0)      D(0)
      D(1)      A(1)      B(1)      C(1)
      C(2)      D(2)      A(2)      B(2)
      B(3)      C(3)      D(3)      A(3)

where the number in parentheses is the DW within the line modulo 4.
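
The rotation in the layout above can be reproduced with a short Python sketch.  The placement rule used here, array = (member index + DW) mod 4 with A=0 through D=3, is inferred from the table and is not stated in the disclosure; the names MEMBERS, array_for, and layout_row are hypothetical and serve only for illustration.

```python
MEMBERS = "ABCD"   # the four members of one congruence class
NUM_ARRAYS = 4     # ARR0 .. ARR3


def array_for(member: str, dw: int) -> int:
    """Index of the array assumed to hold DW `dw` of `member`.

    Assumed rule (read off the table): rotate by the DW number modulo 4.
    """
    return (MEMBERS.index(member) + dw) % NUM_ARRAYS


def layout_row(x: int) -> list[str]:
    """Contents of ARR0..ARR3 for DW position x, labelled with x modulo 4."""
    row = [""] * NUM_ARRAYS
    for member in MEMBERS:
        row[array_for(member, x)] = f"{member}({x % NUM_ARRAYS})"
    return row


if __name__ == "__main__":
    print("  ".join(f"ARR{i}" for i in range(NUM_ARRAYS)))
    for x in range(NUM_ARRAYS):
        print("  ".join(layout_row(x)))
```

Under this rule each member appears exactly once in every row and once in every column, which is the defining property of a Latin square.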

      Thus, in response to the same address sent to all ARRAYS, the
DWs {A(x), B(x), C(x), D(x)} will be driven out to registers.  The
ARRAY which contains A(x) will depend on x.  If different, consecutive
addresses are sent to the ARRAYS, then an OW starting at any DW
boundary can be driven out of the arrays into registers.
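
The two access modes just described can be illustrated with a small Python model.  This is only a sketch under stated assumptions: the row address within each array is taken to be the DW number within the line, the line length LINE_DWS = 16 is a hypothetical value, and the placement rule (member index + DW) mod 4 is inferred from the layout table.  None of the function names come from the disclosure.

```python
MEMBERS = "ABCD"   # the four members of one congruence class
NUM_ARRAYS = 4     # ARR0 .. ARR3
LINE_DWS = 16      # hypothetical line length in DWs (assumption)


def build_arrays():
    """Model each array as {row address: (member, dw)} for one congruence class."""
    arrays = [dict() for _ in range(NUM_ARRAYS)]
    for m_idx, member in enumerate(MEMBERS):
        for dw in range(LINE_DWS):
            # Assumed Latin square rule: rotate the array by the DW number.
            arrays[(m_idx + dw) % NUM_ARRAYS][dw] = (member, dw)
    return arrays


def read_same_address(arrays, x):
    """Late-select style access: the same address x goes to every array."""
    return [arr[x] for arr in arrays]


def read_octword(arrays, member, start):
    """Send a different consecutive address to each array to fetch four
    consecutive DWs of one member, starting at any DW boundary."""
    if not 0 <= start <= LINE_DWS - NUM_ARRAYS:
        raise ValueError("octword must lie within the line")
    m_idx = MEMBERS.index(member)
    addresses = [None] * NUM_ARRAYS
    for dw in range(start, start + NUM_ARRAYS):
        addresses[(m_idx + dw) % NUM_ARRAYS] = dw   # one address per array
    fetched = [arrays[i][addresses[i]] for i in range(NUM_ARRAYS)]
    return sorted(fetched, key=lambda entry: entry[1])  # restore DW order


if __name__ == "__main__":
    arrays = build_arrays()
    print(read_same_address(arrays, 5))
    print(read_octword(arrays, "B", 3))
```

Because four consecutive DW numbers are distinct modulo 4, the four DWs of the requested octword always fall in four different arrays, so a single concurrent access can supply all of them.  How the member is chosen before the addresses are formed is not covered by this sketch.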

      The data once capture...