Cache Directory Lookup With Partial Address

IP.com Disclosure Number: IPCOM000122676D
Original Publication Date: 1991-Dec-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 3 page(s) / 150K

Publishing Venue

IBM

Related People

Liu, L: AUTHOR

Abstract

Disclosed is a technique for utilizing partial address compares for efficient implementation of cache data selection from arrays. A key technique is to maintain distinct partial addresses for cache lines within identical congruence classes.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 42% of the total text.

Cache Directory Lookup With Partial Address

      Disclosed is a technique for utilizing partial address
compares for efficient implementation of cache data selection from
arrays.  A key technique is to maintain distinct partial addresses
for cache lines within identical congruence classes.

      In most high-performance computers, cache access paths have
been critical in determining machine cycle times.  In typical designs
the access of cache data is contingent upon the results of directory
lookups.  For instance, in a set-associative cache design, the data
read out of the arrays can be sent to the requesting unit only after
late-select based on directory compares.  The timeliness of
late-select depends strongly on how physically close the directory
(or directories) is to the arrays.  Ideally, late-select timing can
be optimized if the directory can be embedded in the array chips (or
in nearby areas).  Such an approach is often difficult to realize due
to the amount of silicon required for directories, especially as the
number of address bits increases (e.g., to 48-64) in future
architectures.  We observe that very high late-select accuracy may be
achieved from directory compares on partial address bits.  Hence,
efficient data access to the arrays may be achieved by employing a
partial-address directory physically close to the arrays.

      The basic idea of the invention will be illustrated with an
exemplary cache design (see the figure) for a 64-bit virtual
addressing architecture.  The cache is 256K, with 128 bytes/line,
512 congruence classes, and 4-way set-associativity.  The cache is
virtual address based, with a virtual address directory D.  For each
virtual address A, let A[i,j] denote the partial address bits between
(and including) the i-th and j-th bit positions (0 <= i <= j <= 63).
For each virtual address A, A[57,63] is the offset within the line,
and A[48,56] is the congruence class index.  At each entry of D the
address field records bits 0-47, and hence a total of 96K bits is
required for all the addresses in D.
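The A[i,j] bit-field notation and the derived fields above can be sketched as follows (a hypothetical helper, not from the disclosure; it assumes bit 0 is the most significant bit of the 64-bit virtual address, which makes the stated field widths come out right: 7 offset bits for 128 bytes/line and 9 index bits for 512 classes):

```python
# Sketch of the A[i,j] notation: bits i..j inclusive of a 64-bit
# virtual address, with bit 0 taken as the most significant bit.
def bits(a: int, i: int, j: int) -> int:
    """Return A[i,j] for 0 <= i <= j <= 63."""
    width = j - i + 1
    return (a >> (63 - j)) & ((1 << width) - 1)

def line_offset(a: int) -> int:       # A[57,63]: 7 bits -> 128 bytes/line
    return bits(a, 57, 63)

def congruence_class(a: int) -> int:  # A[48,56]: 9 bits -> 512 classes
    return bits(a, 48, 56)

def d_tag(a: int) -> int:             # bits 0-47, stored in directory D
    return bits(a, 0, 47)

# Directory D storage, as in the text: 512 classes x 4 ways x 48 bits.
D_STORAGE_BITS = 512 * 4 * 48         # 98304 bits = 96K bits
```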

      In addition to the conventional virtual directory D, a separate
partial address directory D1 is considered.  D1 also has 512
congruence classes with 4-way set-associativity.  For each line (with
virtual address A) in the cache, the associated D1 entry only records
8 address bits A[40,47] (vs. 48 bits in D).  The total number of
address bits used in D1 is much smaller (16K).  D1 is allocated
physically close to the arrays.
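The storage saving of D1 over D can be checked with a quick calculation using the constants from the text (the variable names are illustrative):

```python
# Storage comparison between the full directory D and the partial
# directory D1 for the example design described in the text.
CLASSES, WAYS = 512, 4   # 512 congruence classes, 4-way set-associative
D_TAG_BITS = 48          # D records address bits 0-47 per entry
D1_TAG_BITS = 8          # D1 records only A[40,47] per entry

d_bits = CLASSES * WAYS * D_TAG_BITS    # 98304 bits = 96K bits
d1_bits = CLASSES * WAYS * D1_TAG_BITS  # 16384 bits = 16K bits
```

The six-fold reduction in tag storage is what makes it practical to place D1 on or near the array chips.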

      Consider a doubleword fetch access from the CPU with virtual
address A.  Bits A[48,56] are used to select the congruence class for
access as usual.  In the arrays 4 data units (e.g., doublewords) are
read out of the 4 lines in the congruence class for late-select as
usual.  Directory lookup is conducted in both D and D1:
o    D is searched with A[0,56] as usual, which determines cache hit
or miss precisely.  A cache miss will ...
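Although the extract is truncated here, the parallel lookup can be sketched from the earlier description (a hypothetical software model, not the disclosed hardware: D1's 8-bit partial compares drive late-select close to the arrays, while D's full compare decides hit or miss; directory contents and the way-selection policy below are illustrative assumptions):

```python
# Hedged model of the parallel D/D1 lookup for the example design.
# Directories are modeled as 512 x 4 lists of tags (None = invalid).
WAYS = 4

def lookup(D, D1, a):
    cc = (a >> 7) & 0x1FF            # A[48,56]: congruence class index
    full_tag = a >> 16               # A[0,47]: compared in D (precise)
    partial_tag = full_tag & 0xFF    # A[40,47]: compared in D1 (fast)
    # D1, placed close to the arrays, selects candidate ways early.
    late_select = [w for w in range(WAYS) if D1[cc][w] == partial_tag]
    # D determines cache hit or miss precisely, as in the text.
    hit = any(D[cc][w] == full_tag for w in range(WAYS))
    return hit, late_select
```

Because the design maintains distinct partial addresses for the lines within a congruence class, at most one way can match in D1, so the partial compare alone suffices to steer late-select.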