ECC Propagation to Manage One or More Set Associative Cache Directories

IP.com Disclosure Number: IPCOM000031123D
Original Publication Date: 2004-Sep-13
Included in the Prior Art Database: 2004-Sep-13
Document File: 4 page(s) / 61K

Publishing Venue

IBM

Abstract

A method for handling cache directory ECC is described. This method enables a reduction in chip area, better utilization of directory bandwidth, and the possibility of combining multiple directories under one ECC scheme.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 42% of the total text.


Typical set associative cache directory designs use one of the following methods to maintain the ECC that protects the directory contents:
1) Calculate (and store) separate ECC per associativity class
2) Read and buffer a full congruence set (all associativity classes) before updating the contents, calculating ECC, and writing the cache directory
3) Perform a cache directory read-modify-write operation for every update

     Method (1) has an adverse effect on chip area and/or cache directory size (width). Maintaining ECC for every associativity class requires several ECC bits to be stored with each entry. Because the number of data bits an ECC can cover grows exponentially with the number of check bits, this implementation requires a much larger cache directory structure (SRAM/eDRAM) than is required when ECC is calculated over an entire congruence set. As an example, using separate ECC per class, an 8-way set associative cache directory would require at least 48 ECC bits (six per class) to protect the information. In comparison, only nine ECC bits would be required to protect the entire congruence set.
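     The 48-bit and nine-bit figures are consistent with a standard SEC-DED (single-error-correct, double-error-detect) Hamming code, although the text does not name the code used. As a rough sketch under that assumption, and with a hypothetical 20-bit directory entry (the entry width is not given in the text), the check-bit counts can be reproduced as follows:

```python
# Assumption: SEC-DED Hamming code. The disclosure does not name its code.
# Assumption: ENTRY_BITS = 20 is a hypothetical directory entry width.
WAYS = 8
ENTRY_BITS = 20


def secded_check_bits(data_bits: int) -> int:
    """Smallest check-bit count c satisfying 2**(c - 1) >= data_bits + c.

    This is the usual Hamming SEC bound (2**r >= k + r + 1) plus one
    overall parity bit for double-error detection, so c = r + 1.
    """
    c = 2
    while 2 ** (c - 1) < data_bits + c:
        c += 1
    return c


# Method (1): one ECC field per associativity class.
per_class_total = WAYS * secded_check_bits(ENTRY_BITS)

# ECC computed once over the whole congruence set.
per_set_total = secded_check_bits(WAYS * ENTRY_BITS)

print(per_class_total)  # 48
print(per_set_total)    # 9
```

     Under these assumptions, any entry width from 16 to 26 bits gives the same two totals, which is why the specific width chosen here is not critical to the comparison.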

     Method (2) also has a negative impact on chip area. Buffering an entire congruence set before calculating ECC requires a significant area increase due to the large number of latches (or an additional array structure) needed to temporarily store a full congruence set. With incremental ECC and propagation, by contrast, only a single associativity class needs to be buffered.
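     The abbreviated text does not spell out the incremental ECC mechanism. One common way such an update can work relies on the linearity of parity-based codes: the new check bits equal the old check bits XORed with the check bits of the changed bits alone. The sketch below illustrates that general idea only. The column assignment, the 20-bit entry width, and all function names are hypothetical and are not taken from the disclosure.

```python
# Illustrative only: a simple Hamming-style linear code, not the
# disclosure's actual ECC. ENTRY_BITS and WAYS are assumed values.
WAYS = 8
ENTRY_BITS = 20


def make_columns(n_data_bits):
    """Assign each data bit a distinct syndrome that is not a power of two."""
    cols, v = [], 3
    while len(cols) < n_data_bits:
        if v & (v - 1):  # powers of two are reserved for the check bits
            cols.append(v)
        v += 1
    return cols


def check_bits(word, cols):
    """XOR together the syndromes of every data bit that is set."""
    ecc = 0
    for i, col in enumerate(cols):
        if (word >> i) & 1:
            ecc ^= col
    return ecc


def pack(entries):
    """Concatenate the entries of all associativity classes into one word."""
    word = 0
    for way, entry in enumerate(entries):
        word |= entry << (way * ENTRY_BITS)
    return word


def incremental_update(old_ecc, way, old_entry, new_entry, cols):
    """Update set-wide check bits using only one class's old and new entry."""
    delta = (old_entry ^ new_entry) << (way * ENTRY_BITS)
    return old_ecc ^ check_bits(delta, cols)


cols = make_columns(WAYS * ENTRY_BITS)
entries = [(0x1F3A7 * (w + 1)) & 0xFFFFF for w in range(WAYS)]
old_ecc = check_bits(pack(entries), cols)

# Replace class 3 without touching the other seven entries.
new_entry = 0xABCDE
inc_ecc = incremental_update(old_ecc, 3, entries[3], new_entry, cols)

entries[3] = new_entry
print(inc_ecc == check_bits(pack(entries), cols))  # True
```

     Only the old entry, the new entry, and the old check bits are needed for the update, which matches the claim that a single associativity class is all that must be buffered.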

     Method (3) consumes more directory bandwidth than may be desirable. First, a read (snoop) is performed to interrogate the initial cache directory information. Since no buffering is performed, any update must first re-read the directory. Following this, the update is applied, new ECC is calculated, and the cache directory contents are written back. At a minimum, this procedure requires three cache directory accesses (two reads and one write).

     The principles of this invention may also be extended to combine multiple cache directories while protecting the data with only a single ECC. For example, in a cache-coherent NUMA (CC-NUMA) system, total system memory is the sum of each node's local memory. In a distributed implementation, each node has its own set of cache directories. One directory keeps track of system memory cached by processors on the local node (call this a local directory or LDir). A second directory keeps track of local memory cached by the processors (or node controller) on the other (remote) nodes (call this a remote directory or RDir). Using incremental ECC and propagation, the LDir and RDir may be combined into a single cache directory structure. This reduces the cache directory size (width) because only a single ECC is required.

     This invention expands/improves upon method (2) described previously. Rather than buffering a full congruence set, only a single associativity class along with the old directory...