
Hardware Engine for Automatic DRAM ECC Scrubbing

IP.com Disclosure Number: IPCOM000014209D
Original Publication Date: 2001-Jun-09
Included in the Prior Art Database: 2003-Jun-19
Document File: 2 page(s) / 42K

Publishing Venue

IBM

Abstract

As memory technologies have grown denser, the probability of single bits within the memory incorrectly changing state has increased significantly. Most ASIC chips with DRAM interfaces implement error correcting codes (ECC) that correct single bit errors and detect multibit errors. The correction that is done is in the data that is presented to the requester and not necessarily in the underlying memory. If multibit errors are detected, the subsystem is typically either reset or marked as failed at the system level. To decrease the probability that words within the DRAM memory that have single bit errors suffer a second bit flip resulting in an uncorrectable location, it is desirable to correct ("scrub") any single bit errors in the DRAM memory.

This is the abbreviated version, containing approximately 51% of the total text.



Disclosed here is a hardware engine that automatically issues background reads to incrementing addresses in DRAM. When a single bit error is detected, the corrected data is written back to the memory location that had the error. This method of scrubbing the memory of single bit errors has the following key advantages:
1) No firmware is required beyond hardware initialization.
2) An access to a potentially failing location by the application is not required to scrub the location of an error. This prevents infrequently accessed locations from accumulating multibit errors.
3) Since the hardware issues the scrub accesses in the background, the performance penalty to scrub the memory is negligible.

The invention solves the problem with a side hardware engine in the low level DRAM logic that performs background memory accesses looking for single bit errors. Upon detection of a single bit error, the hardware automatically writes the corrected data back to the underlying memory as an atomic read-modify-write operation. Only the scrub logic writes back corrected data; other functional reads do not attempt to update the memory.
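The behavior described above can be modeled in software as a rough sketch. The disclosure does not name the ECC code or the word width, so the example below assumes a toy extended Hamming (8,4) SEC-DED code over 4-bit data; the names `encode`, `decode`, `scrub_pass` and `functional_read` are hypothetical and chosen only for illustration. In this single-threaded model the read and the write-back are trivially atomic, whereas the hardware has to guarantee that property itself.

```python
# Toy model of ECC scrubbing. ASSUMPTION: extended Hamming (8,4) SEC-DED;
# the disclosure does not specify the actual code or word width.
DATA_POSITIONS = (3, 5, 6, 7)    # codeword bit positions holding data bits
PARITY_POSITIONS = (1, 2, 4)     # Hamming parity bits; bit 0 is overall parity
OK, CORRECTED, UNCORRECTABLE = "ok", "corrected", "uncorrectable"

def encode(nibble):
    """Encode 4 data bits into an 8-bit SEC-DED codeword."""
    bits = [0] * 8
    for i, pos in enumerate(DATA_POSITIONS):
        bits[pos] = (nibble >> i) & 1
    for p in PARITY_POSITIONS:
        bits[p] = sum(bits[pos] for pos in range(1, 8) if pos & p) % 2
    bits[0] = sum(bits[1:]) % 2
    return sum(b << i for i, b in enumerate(bits))

def decode(word):
    """Return (data, status); corrects one flipped bit, detects two."""
    bits = [(word >> i) & 1 for i in range(8)]
    syndrome = 0
    for p in PARITY_POSITIONS:
        if sum(bits[pos] for pos in range(1, 8) if pos & p) % 2:
            syndrome |= p
    overall_mismatch = sum(bits) % 2
    if overall_mismatch:
        bits[syndrome] ^= 1          # single-bit error at position `syndrome`
        status = CORRECTED
    elif syndrome:
        status = UNCORRECTABLE       # parity agrees but syndrome is non-zero
    else:
        status = OK
    data = sum(bits[pos] << i for i, pos in enumerate(DATA_POSITIONS))
    return data, status

def functional_read(memory, addr):
    """Ordinary read: the requester gets corrected data, memory is untouched."""
    return decode(memory[addr])

def scrub_pass(memory):
    """Walk incrementing addresses and write back any corrected word."""
    corrected, uncorrectable = [], []
    for addr in range(len(memory)):
        data, status = decode(memory[addr])
        if status == CORRECTED:
            memory[addr] = encode(data)   # the scrub write-back
            corrected.append(addr)
        elif status == UNCORRECTABLE:
            uncorrectable.append(addr)
    return corrected, uncorrectable

memory = [encode(a) for a in range(16)]
memory[5] ^= 1 << 6                       # inject a single-bit error
print(functional_read(memory, 5))         # data is corrected for the requester
print(scrub_pass(memory))                 # the scrub repairs the stored word
```

A functional read of address 5 returns the right data but leaves the flipped bit in memory; only the scrub pass repairs the stored word, which is the distinction the paragraph above draws.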

The scrub engine provides a controlled method of removing accumulated single bit errors. It is better than simply having the hardware correct the data in memory on any read, or raising an interrupt to software, in the following ways:
1) The scrub engine controls the rate of these update operations so as to have minimal impact on system performance. Turning a read into a read-correct-write operation greatly reduces the DRAM bandwidth, so without this control system performance would suffer if there were a hard single bit error in one frequently used memory location.
2) The scrub engine checks all memory locations, not just the frequently used ones.
3) To provide an atomic read-correct-write operation, other overlapped accesses mus...
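The rate control in item 1 and the full address coverage in item 2 can be illustrated with a small cycle-level sketch. The disclosure does not say how the rate is set, so the `interval_cycles` parameter, the `bus_busy` input and the policy of deferring to functional traffic are all assumptions made for illustration; `ScrubEngine` and `tick` are hypothetical names.

```python
class ScrubEngine:
    """Paces scrub accesses; ASSUMED policy, not taken from the disclosure."""

    def __init__(self, num_addresses, interval_cycles):
        self.num_addresses = num_addresses
        self.interval_cycles = interval_cycles   # minimum cycles between scrubs
        self.next_addr = 0
        self.countdown = interval_cycles

    def tick(self, bus_busy):
        """Advance one cycle; return the address to scrub now, or None."""
        if self.countdown > 0:
            self.countdown -= 1
        if self.countdown > 0 or bus_busy:
            return None                          # too early, or yield to traffic
        addr = self.next_addr
        # Incrementing address with wrap-around, so every location is visited.
        self.next_addr = (self.next_addr + 1) % self.num_addresses
        self.countdown = self.interval_cycles
        return addr

engine = ScrubEngine(num_addresses=4, interval_cycles=4)
issued = [engine.tick(bus_busy=False) for _ in range(16)]
print([a for a in issued if a is not None])
```

With an interval of 4 cycles the engine uses at most one access slot in four, and a larger interval trades scrub latency for lower bandwidth use.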