Unique Method for Reporting Errors Detected/Corrected by ECC Circuitry

IP.com Disclosure Number: IPCOM000105872D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 2 page(s) / 79K

Publishing Venue

IBM

Related People

Hunter, SW: AUTHOR [+2]

Abstract

This disclosure describes a method for collecting error information from ECC hardware logic at a central location by using the message-driven concepts that are common in today's operating systems. What distinguishes the method is that the ECC hardware logic can enqueue a message containing error information to any destination within the system. The only requirements are that the hardware logic have access to the data paths used for message passing and that it generate messages compatible with the system message formats.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      This disclosure describes a method for collecting error
information from ECC hardware logic at a central location by using the
message-driven concepts that are common in today's operating systems.
What distinguishes the method is that the ECC hardware logic can
enqueue a message containing error information to any destination
within the system.  The only requirements are that the hardware logic
have access to the data paths used for message passing and that it
generate messages compatible with the system message formats.
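
      The message-driven reporting described above can be pictured
with a small software model.  This is only a sketch under stated
assumptions, not the disclosed hardware design: the names
EccErrorMessage, MessageQueue and EccReporter, and every field in the
message, are hypothetical, since the text does not define a message
format.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class EccErrorMessage:
    # Hypothetical fields; the disclosure does not specify a format.
    source_id: int    # which ECC unit produced the report
    address: int      # memory address where the error was seen
    syndrome: int     # raw syndrome value from the ECC logic
    corrected: bool   # True if the hardware corrected the error


class MessageQueue:
    """Stand-in for one system message queue (assumed FIFO)."""

    def __init__(self, name):
        self.name = name
        self._items = deque()

    def enqueue(self, message):
        self._items.append(message)

    def dequeue(self):
        # Return the oldest message, or None if the queue is empty.
        return self._items.popleft() if self._items else None

    def __len__(self):
        return len(self._items)


class EccReporter:
    """Models the reporting side of the ECC logic.

    The destination is a parameter, reflecting the idea that the
    message can be sent to any queue the logic can reach.
    """

    def __init__(self, source_id, destination):
        self.source_id = source_id
        self.destination = destination

    def report(self, address, syndrome, corrected):
        message = EccErrorMessage(self.source_id, address, syndrome, corrected)
        self.destination.enqueue(message)
        return message
```

      In this model, redirecting the reports to a different collector
only means constructing EccReporter with a different queue; the code
that detects the error does not change.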

      When one or more bits of stored data become corrupted, various
hardware methods are available to recover the original data.  One
such method is the use of Error Correction Codes (ECCs).  An ECC adds
extra bits (check bits) that carry redundant information about the
stored data, so that when one or more bits are corrupted, the original
data can be recovered.  Recovery is accomplished by generating a
syndrome from the corrupted data; the syndrome is used to detect or
correct the error, and it has a unique value for each error that can
be corrected.  Normally, if a single bit error occurs, the hardware
uses the syndrome to correct the corrupted bit and masks (or hides)
the occurrence without recording any information about the cause of
the error.
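
      As an illustration of how a syndrome identifies a single bit
error, the following sketch uses a Hamming(7,4) code.  The choice of
code is an assumption made for brevity; the disclosure does not name
the code used, and memory systems generally protect wider words.  The
function names are hypothetical.

```python
def encode(data_bits):
    """Encode 4 data bits [d1, d2, d3, d4] into a 7-bit Hamming codeword.

    Bit positions are numbered 1..7; check bits sit at positions 1, 2, 4.
    """
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4   # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # covers positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def syndrome(word):
    """XOR together the positions of all set bits.

    The result is 0 for a valid codeword, and equals the 1-based
    position of the flipped bit after a single bit error.
    """
    s = 0
    for position, bit in enumerate(word, start=1):
        if bit:
            s ^= position
    return s


def correct(word):
    """Return (corrected_word, syndrome) assuming at most one bad bit."""
    s = syndrome(word)
    fixed = list(word)
    if s:
        fixed[s - 1] ^= 1   # flip the bit the syndrome points at
    return fixed, s


def extract_data(word):
    """Recover [d1, d2, d3, d4] from positions 3, 5, 6, 7."""
    return [word[2], word[4], word[5], word[6]]
```

      For example, encoding [1, 0, 1, 1] and flipping the bit at
position 5 yields a syndrome of 5, and correct() restores the original
codeword.  The syndrome is exactly the information that ordinary
correction uses and then discards.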

      However, it is sometimes desirable to collect specific
information about these memory errors, such as the syndrome, so that
statistics can be kept for individual memory components.  Such
statistics make it possible to detect a memory component that is
producing an excessive number of single bit errors, a condition that
potentially lowers the overall reliability of the memory and thus the
availability of a product.
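
      Per-component statistics of this kind could be kept as sketched
below.  Everything here is an assumption made for illustration: the
mapping from syndrome to component (the syndrome of a single bit error
is taken to be the 1-based position of the failing bit, and each
component is taken to supply a fixed number of adjacent bits), the
threshold policy, and the names ErrorStatistics and component_for.

```python
from collections import Counter


def component_for(bank, syndrome, bits_per_component=4):
    """Map a single-bit-error syndrome to a (bank, component) pair.

    Hypothetical layout: the syndrome is the 1-based failing bit
    position, and each component supplies bits_per_component
    adjacent bits of the word.
    """
    bit_index = syndrome - 1
    return (bank, bit_index // bits_per_component)


class ErrorStatistics:
    """Counts corrected errors per memory component."""

    def __init__(self, threshold):
        self.threshold = threshold   # counts above this are "excessive"
        self.counts = Counter()

    def record(self, component):
        # Count one more corrected error and return the new total.
        self.counts[component] += 1
        return self.counts[component]

    def suspects(self):
        """Components whose error count exceeds the threshold."""
        return sorted(c for c, n in self.counts.items() if n > self.threshold)
```

      A collector would call record() once per reported error and
consult suspects() to find components that may need replacement.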

      To add the capability for saving the in...