Method for improving system fatal FIT by correcting L1 tag parity errors

IP.com Disclosure Number: IPCOM000028869D
Publication Date: 2004-Jun-04
Document File: 3 page(s) / 43K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for improving the system fatal failure-in-time (FIT) rate by correcting level 1 (L1) tag parity errors. Benefits include improved functionality and improved design flexibility.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.

Background

Conventional processor caches protect the L1 tag arrays with parity. Single-bit errors in the tag memory can be detected but cannot be corrected. If an L1 access finds a tag parity error in the set but a real hit occurs, the error is detected but does not prevent the L1 hit from being serviced. If the access is a miss in the cache and the modified, exclusive, shared, invalid (MESI) state of the error line is exclusive or shared, the L1 processes the request as a miss and invalidates the erroneous line.

If the access is a miss and the MESI state of the error line is modified, the error is fatal. The L1 cache cannot properly service the line, and a system reset results. This case contributes to the system fatal FIT.

Conventional error-handling mechanisms only detect tag errors. They make no attempt at correction using the redundant information that is available.
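The conventional handling described in this section can be summarized as a small decision function. The Python sketch here is illustrative only: the names Mesi and conventional_action and the returned action strings are hypothetical, and the behavior for an invalid error line is an assumption, because the text does not cover that case.

```python
from enum import Enum


class Mesi(Enum):
    MODIFIED = "M"
    EXCLUSIVE = "E"
    SHARED = "S"
    INVALID = "I"


def conventional_action(real_hit: bool, error_line_state: Mesi) -> str:
    """Action taken when a tag parity error is flagged in the accessed set."""
    if real_hit:
        # The error is detected, but the hit is still serviced.
        return "service_hit"
    if error_line_state in (Mesi.EXCLUSIVE, Mesi.SHARED):
        # Treat the access as a miss and invalidate the erroneous line.
        return "miss_and_invalidate"
    if error_line_state is Mesi.MODIFIED:
        # The line cannot be serviced properly: system reset.
        return "fatal_reset"
    # Assumption: an invalid error line needs no special handling.
    return "miss"
```

Only the modified case leads to a reset, which is the case the disclosed method targets.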

General description

The disclosed method corrects single-bit errors in an L1 tag array that uses parity protection. The method uses redundant information that is available in the cache hierarchy. The L1 MESI states are extended to record whether the line also exists in the L2 cache. The error handler searches the L2 cache for any potential match.
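One way to picture the extended state encoding is as an enumeration with two extra members. This is a sketch under assumptions, not the disclosed design: the state names E' and M' come from the Advantages section, while L1State, l2_search_useful, and the rule that unprimed valid states leave an L2 copy possible are hypothetical readings of the text.

```python
from enum import Enum


class L1State(Enum):
    INVALID = "I"
    SHARED = "S"
    EXCLUSIVE = "E"
    MODIFIED = "M"
    EXCLUSIVE_PRIME = "E'"  # exclusive in L1, absent from L2
    MODIFIED_PRIME = "M'"   # modified in L1, absent from L2


def l2_search_useful(state: L1State) -> bool:
    """True when the state does not rule out a copy of the line in L2.

    Assumption: S, E, and M leave an L2 copy possible; E' and M' rule it out;
    an invalid line is not searched for at all.
    """
    return state in (L1State.SHARED, L1State.EXCLUSIVE, L1State.MODIFIED)
```

Under this reading, an error handler would search the L2 cache only when l2_search_useful returns True for the error line.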

The disclosed method also includes an ERRORMISS signal, which indicates a real tag miss because a valid mismatch occurs in a portion of the tag. The signal identifies cases where the current tag lookup cannot be for the error line. As a result, the fatal reset is avoided and a correction is attempted.
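The ERRORMISS condition can be modeled in software to make the idea concrete. The Python sketch here assumes a 16-bit tag split into two 8-bit portions, one even-parity bit per portion, and at most one flipped bit per tag; all function names are hypothetical, and the real mechanism is hardware logic rather than code.

```python
from typing import List


def parity(value: int) -> int:
    """Even-parity bit of value: 1 if the number of set bits is odd."""
    return bin(value).count("1") & 1


def portions(tag: int, bits: int = 8, count: int = 2) -> List[int]:
    """Split a tag into equal-width portions, lowest portion first."""
    mask = (1 << bits) - 1
    return [(tag >> (i * bits)) & mask for i in range(count)]


def parity_bits(tag: int) -> List[int]:
    """Parity bits stored alongside a tag, one per portion."""
    return [parity(p) for p in portions(tag)]


def error_miss(stored_tag: int, stored_parity: List[int], lookup_tag: int) -> bool:
    """True when a portion with clean parity differs from the lookup tag.

    Such a mismatch is trusted, so the lookup cannot be for this line.
    """
    for stored, check, wanted in zip(
        portions(stored_tag), stored_parity, portions(lookup_tag)
    ):
        parity_error = parity(stored) != check
        if not parity_error and stored != wanted:
            return True
    return False
```

For example, a stored tag 0xCD34 with its low bit flipped still differs from a lookup tag 0xAB12 in the clean high portion, so error_miss returns True. A stored tag 0xAB34 with the same flip matches the lookup in the clean portion, so error_miss returns False and the lookup might be for the error line.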

Advantages

The disclosed method provides advantages, including:

• Improved functionality due to signaling a tag lookup miss and attempting error recovery

• Improved functionality due to providing two cache states, E’ and M’, to denote lines that are present in the L1 cache but absent from the L2 cache

• Improved design flexibility, because the method can be adapted for caches with single-error-correction, double-error-detection (SECDED) tag protection

 


Detailed description

The disclosed method corrects many errors in L1 tag arrays. The method includes an algorithm that functions as follows (see Figure 1):

• The L1 tag is protected using multiple parity bits (for example, two), each covering a different part of the tag. If a tag mismatch occurs in a portion of the tag that does not have a parity error, the ERRORMISS signal is asserted. If it is asserted by one of the tag portions when a tag error is flagged, the current access proceeds because the error line cannot b...