Method for improving system recovery by correcting L2 double-bit errors

IP.com Disclosure Number: IPCOM000029835D
Publication Date: 2004-Jul-14
Document File: 2 page(s) / 9K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for improving system recovery by correcting level 2 (L2) cache double-bit errors. Benefits include improved functionality and improved reliability.

This text was extracted from a Microsoft Word document.
This is the abbreviated version, containing approximately 54% of the total text.

Background

              Conventional processor caches protect their data arrays with error checking and correction (ECC) that provides single-bit error correction and double-bit error detection (SECDED). Single-bit errors can therefore be corrected, while double-bit errors can only be detected.
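
              As an illustration of SECDED behavior, the sketch below uses an extended Hamming (8,4) code. It is a minimal example under stated assumptions, not the code used by any particular cache: real data arrays protect much wider words, and the names secded_encode and secded_decode are hypothetical.

```python
def secded_encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits)."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    # Data bits occupy the non-power-of-two positions 3, 5, 6, 7.
    code[3], code[5], code[6], code[7] = d
    # Hamming check bits sit at positions 1, 2, 4.
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    # Position 0 holds an overall parity bit, making the total parity even.
    code[0] = sum(code[1:]) & 1
    return code


def secded_decode(code):
    """Return (status, nibble); nibble is None for an uncorrectable error."""
    code = list(code)  # work on a copy so the caller's word is untouched
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos
    odd_parity = sum(code) & 1
    if odd_parity:
        # One bit flipped (assuming at most two errors): the syndrome is its
        # position, or 0 if the overall parity bit itself flipped.
        code[syndrome] ^= 1
        status = "corrected"
    elif syndrome:
        # Even parity but a non-zero syndrome: two bits flipped.
        return "double-bit error", None
    else:
        status = "ok"
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, nibble


word = secded_encode(0b1011)
word[5] ^= 1                    # inject one bit flip
print(secded_decode(word))      # ('corrected', 11)
word[2] ^= 1                    # inject a second bit flip
print(secded_decode(word))      # ('double-bit error', None)
```

              The second result is the case the disclosure addresses: the code can report that two bits are wrong but cannot say which ones.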

              Some processors have a poison mechanism to deal with double-bit errors in the L2 cache or memory. When a line from the L2 cache or the front-side bus (FSB) is detected to have a double-bit error, the fill to the L1 cache is flagged as poison. The level 1 (L1) cache stores the filled line with a double-bit error so that the poison information is preserved. If the poisoned data is consumed, a machine check (MCA) occurs that eventually leads to processor firmware/operating system (PAL/OS) handling, termination of the application, and rebooting of the system.
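
              The poison behavior can be modeled in a few lines. This is a simplified sketch: the poison flag is kept as a plain boolean rather than as a stored double-bit error, and the names L1Line, fill_l1, consume, and MachineCheck are hypothetical.

```python
class MachineCheck(Exception):
    """Stands in for the machine check (MCA) raised when poison is consumed."""


class L1Line:
    def __init__(self, data, poisoned):
        self.data = data
        self.poisoned = poisoned


def fill_l1(data, double_bit_error):
    # The fill itself does not fault; the line is only marked as poisoned.
    return L1Line(data, poisoned=double_bit_error)


def consume(line):
    # Consuming poisoned data is what triggers the machine check.
    if line.poisoned:
        raise MachineCheck("poisoned data consumed")
    return line.data
```

              In this model a poisoned line that is never consumed causes no fault, which matches the text above: the machine check occurs only on consumption.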

              Recovering from errors by terminating the application is called system recovery. When an L2 line returning data to the L1 cache has a single-bit error, an ECC data error is signaled to the L1 cache. This signal prevents the fill buffer from being drained and prevents any data from passing to the register file. The L2 then resends the data to the L1.

General description

              The disclosed method corrects double-bit errors in an L2 cache that has SECDED ECC protection.

              The method is built on top of an existing single-bit error recovery mechanism and does not add timing constraints.

              The disclosed method relies on the redundant copies of a line that exist in the L2 cache and in memory. If an L2 read returns an error state, the read is treated as a miss in the L2 cache and the data is fetched from memory instead. The line currently being filled to the L1 cache is flagged as Will-Resend to signal that the line will be serviced from the FSB. The L1 manages its fill buffers accordingly so that no fill entry is deallocated unnecessarily. The likelihood is very high that a line corrupted in L2 is not also corrupted in memory.
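
              The flow described above can be sketched as a small behavioral model. It is only an illustration under stated assumptions: the L2 and memory are plain dictionaries, the single-bit path returns data that is assumed to be already corrected, and every name other than Will-Resend (Ecc, Signal, service_fill) is hypothetical.

```python
from enum import Enum


class Ecc(Enum):
    OK = "ok"
    SINGLE_BIT = "single-bit error"
    DOUBLE_BIT = "double-bit error"


class Signal(Enum):
    ECC_DATA_ERROR = "ecc-data-error"  # single-bit case: L2 resends the data
    WILL_RESEND = "will-resend"        # double-bit case: FSB will supply the line
    DATA_VALID = "data-valid"          # fill data may pass to the register file


def service_fill(addr, l2, memory):
    """Return (data, signals) for one L1 fill request.

    l2 maps address -> (data, Ecc); memory maps address -> data.
    """
    signals = []
    entry = l2.get(addr)
    if entry is not None:
        data, status = entry
        if status is Ecc.OK:
            return data, [Signal.DATA_VALID]
        if status is Ecc.SINGLE_BIT:
            # Existing mechanism: hold the fill buffer, then L2 resends.
            return data, [Signal.ECC_DATA_ERROR, Signal.DATA_VALID]
        # Double-bit error: keep the fill entry allocated and fall through
        # to the miss path instead of poisoning the line.
        signals.append(Signal.WILL_RESEND)
    # L2 miss (real or forced): the line is serviced from memory over the FSB.
    signals.append(Signal.DATA_VALID)
    return memory[addr], signals


l2 = {0x40: (b"aaaa", Ecc.OK), 0xC0: (None, Ecc.DOUBLE_BIT)}
memory = {0x40: b"aaaa", 0xC0: b"cccc"}
data, signals = service_fill(0xC0, l2, memory)
print(data, [s.value for s in signals])  # b'cccc' ['will-resend', 'data-valid']
```

              In this model the double-bit case reuses the ordinary miss path, which is consistent with the statement that the method sits on top of the existing recovery mechanism.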

Advantages

          ...