Memory Chip Failure-Detection Mechanism

IP.com Disclosure Number: IPCOM000119887D
Original Publication Date: 1991-Mar-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 3 page(s) / 138K

Publishing Venue

IBM

Related People

Eng, RC: AUTHOR [+3]

Abstract

This article describes a mechanism for detection of multiple bit memory failures in a computer system caused by failure of an entire memory chip.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

      This article describes a mechanism for detection of
multiple bit memory failures in a computer system caused by failure
of an entire memory chip.

      One approach to detecting errors in memory storage devices
is parity testing.  Parity testing is accomplished by counting the
number of 1 bits in a byte and setting a parity bit associated with
that byte so that the total count of 1 bits, including the parity
bit, is even (even parity) or odd (odd parity).  The byte and the
associated parity bit can be retested at a later time to detect
changes to the data.  Parity testing of bytes of data is well known
in the art.  The limitation of parity testing is that a failure of
an even number of bits will cause a later parity test to indicate
valid data even though the data is invalid.  Parity testing is
therefore generally useful only for the detection of single-bit
errors and is not used to correct errors.
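
      The parity limitation described above can be illustrated
with a minimal Python sketch.  It is not taken from the disclosure;
the names parity_bit and parity_ok and the sample data value are
assumptions chosen only for illustration.

```python
def parity_bit(byte: int, odd: bool = False) -> int:
    """Return the parity bit to store alongside an 8-bit value."""
    ones = bin(byte & 0xFF).count("1")   # number of 1 bits in the byte
    bit = ones & 1                       # even parity: total count of 1s becomes even
    return bit ^ 1 if odd else bit       # odd parity is simply the complement


def parity_ok(byte: int, stored_bit: int, odd: bool = False) -> bool:
    """Retest a byte against the parity bit stored for it earlier."""
    return parity_bit(byte, odd) == stored_bit


data = 0b10110010                  # illustrative value with four 1 bits
stored = parity_bit(data)          # parity bit saved when the byte is written

single_flip = data ^ 0b00000100    # one bit changes
double_flip = data ^ 0b00000110    # two bits change

print(parity_ok(data, stored))         # True: data unchanged
print(parity_ok(single_flip, stored))  # False: single-bit error detected
print(parity_ok(double_flip, stored))  # True: double-bit error goes unnoticed
```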

      Another approach to detection of errors is Error Checking and
Correction (ECC) logic.  Many ECC schemes have been implemented and
are well known in the art.  In general, ECC schemes involve
generating ECC codes based on multiple bytes of data.  When the data
is used at a later time, the ECC codes can be used to detect and
correct single-bit errors.  In addition, ECC can detect double-bit
errors.  However, as the number of bits in error increases, detection
of the errors becomes increasingly difficult because the logic
techniques used to generate the ECC codes become increasingly complex
and begin to break down.  As a result, existing ECC techniques start
to fail when errors of 3 or more bits occur.
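
      The behavior described above (single-bit correction,
double-bit detection, and breakdown at 3 or more bits) can be
sketched with a small single-error-correcting, double-error-detecting
code.  The sketch below uses an extended Hamming code over a 4-bit
value purely as a stand-in; the disclosure does not say which ECC
scheme is meant, and practical schemes cover multiple bytes.  The
names encode, decode and flip are assumptions made for illustration.

```python
def encode(nibble: int) -> list:
    """Encode 4 data bits as an 8-bit extended Hamming codeword."""
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8                       # index 0: overall parity, 1..7: Hamming(7,4)
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]
    code[2] = code[3] ^ code[6] ^ code[7]
    code[4] = code[5] ^ code[6] ^ code[7]
    code[0] = sum(code[1:]) & 1          # makes the total number of 1s even
    return code


def decode(code: list) -> tuple:
    """Return (status, nibble) for a possibly corrupted codeword."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos              # XOR of the positions holding a 1
    overall = sum(code) & 1
    code = code.copy()
    if syndrome == 0 and overall == 0:
        status = "ok"
    elif overall == 1:
        # Odd number of flipped bits: the decoder assumes exactly one.
        code[syndrome] ^= 1              # syndrome 0 points at the overall parity bit
        status = "corrected"
    else:
        status = "uncorrectable"         # nonzero syndrome, even parity: two bits flipped
    nibble = code[3] | code[5] << 1 | code[6] << 2 | code[7] << 3
    return status, nibble


def flip(code: list, positions: list) -> list:
    """Return a copy of the codeword with the given positions inverted."""
    out = code.copy()
    for pos in positions:
        out[pos] ^= 1
    return out


word = encode(0b1011)
print(decode(flip(word, [5])))           # single-bit error: corrected, data recovered
print(decode(flip(word, [5, 6])))        # double-bit error: flagged as uncorrectable
print(decode(flip(word, [3, 5, 7])))     # triple-bit error: reported "corrected", data wrong
```

      In the last case the decoder treats the 3-bit error as a
single-bit error and "corrects" toward a different valid codeword, so
the error is neither detected nor repaired.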

      Memory circuits in many systems used today are packaged in the
form of memory chips containing multiple bits of a byte.  Memory chip
packaging is well known in the art.  A typical memory chip may hold
bits 0-3 of an 8-bit byte, and a second memory chip would be used to
hold bits 4-7 of that same byte.  It is possible for an entire chip
to fail for numerous reasons, such as voltage or ground leads
disconnecting from the chip, internal failure of the chip, etc.
Since a chip may hold an even number of bits in a byte, chip failure
can produce an even number of invalid bits in a byte, which would not
be detectable by a parity circuit.  Likewise, a 4-bit error due to a
chip failure may not be detectable by ECC logic, because ECC logic
cannot reliably recognize multi-bit errors of 3 or more bits.  As a
result, failure of an entire chip would create a situation where both
parity checking and ECC checking could fail.
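
      The effect of a whole-chip failure on parity checking can be
sketched as follows.  The sketch assumes, for illustration only, that
one chip holds the low 4 bits of each byte, that the failed chip
reads back as all zeros (or all ones), and that the parity bit is
held elsewhere and is unaffected.  The names
read_with_failed_chip and undetected are hypothetical.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for an 8-bit value."""
    return bin(byte & 0xFF).count("1") & 1


def read_with_failed_chip(byte: int, failed_chip: int, stuck_at: int = 0) -> int:
    """Model a read where one 4-bit chip returns all 0s or all 1s.

    Chip 0 is assumed to hold the low 4 bits, chip 1 the high 4 bits.
    """
    mask = 0x0F if failed_chip == 0 else 0xF0
    if stuck_at:
        return (byte | mask) & 0xFF
    return byte & ~mask & 0xFF


# Try every possible stored byte with chip 0 stuck at zero.
corrupted = 0
undetected = []
for byte in range(256):
    stored_parity = parity_bit(byte)          # computed when the byte was written
    seen = read_with_failed_chip(byte, failed_chip=0)
    if seen != byte:
        corrupted += 1
        if parity_bit(seen) == stored_parity:
            undetected.append(byte)           # parity test passes on bad data

print(corrupted, len(undetected))
```

      Under these assumptions, every stored byte whose failed half
differs from the stuck value in an even number of bit positions
passes the parity test despite being corrupted.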

      In the drawings, Fig. 1 shows a prior-art implementation of a
multi-byte memory having multiple memory chips.  Fig. 2 shows the
mechanism disclosed herein implemented in a multi-byte memory having
multiple memory chips.

      For the purpose of this disclosure, a word is 32 bits of
information comprising a 4-byte group (b...