Diagnostics Which Check Hardware ECC Correction/Detection

IP.com Disclosure Number: IPCOM000120873D
Original Publication Date: 1991-Jun-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 9 page(s) / 435K

Publishing Venue

IBM

Related People

Hanna, SD: AUTHOR

Abstract

Disclosed is a procedure that allows microcode to test a hardware ECC algorithm in a microprocessor that has no dedicated provision for testing the algorithm, provided the microprocessor can replace any failing memory module (4 bits wide) with a spare memory module. This sparing capability is what enables the testing.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 35% of the total text.

      The sequence of operations used to test the embedded ECC
algorithm forces every possible single bit error and many of the
possible combinations of double bit errors.

      The ECC algorithm is assumed to operate on a 32-bit data bus
with seven additional ECC check bits.  Such an algorithm provides for
single bit error correction, i.e., any single bit error in any of the
39 bits can be corrected automatically.  This algorithm also can
detect but not correct any double bit error anywhere in the 39 bits.
In addition, this algorithm detects 'module kills'.  The memory is
arranged as ten modules, each storing and retrieving four bits of
information, either data or a combination of data and ECC bits.  A
module kill is the failure of an entire memory module, which can,
therefore, yield 0, 1, 2, 3, or 4 bit errors on the bus.  Thus the
algorithm can detect not only double bit errors, but also any three
or four bit error whose failing bits all lie in one module.
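
      The single bit correction and double bit detection described
above can be illustrated with a minimal sketch.  It assumes a textbook
extended Hamming code over 32 data bits and 7 check bits; the
disclosure does not give the actual check-bit equations, so the bit
layout and the names encode and decode are hypothetical.  This
textbook code is not designed to reproduce the detection of three and
four bit errors within one module.

```python
# Assumed layout (not from the disclosure): index 0 is an overall parity
# bit, indices 1, 2, 4, 8, 16, 32 are Hamming check bits, and the other
# 32 indices in 1..38 carry data.
PARITY_POS = (1, 2, 4, 8, 16, 32)
DATA_POS = [p for p in range(1, 39) if p not in PARITY_POS]


def encode(data):
    """Return a 39-bit codeword (list of bits) for a 32-bit data value."""
    word = [0] * 39
    for i, pos in enumerate(DATA_POS):
        word[pos] = (data >> i) & 1
    for p in PARITY_POS:
        # Each check bit covers every position whose index has bit p set.
        word[p] = sum(word[q] for q in range(1, 39) if q & p and q != p) % 2
    word[0] = sum(word[1:]) % 2  # overall parity makes the total even
    return word


def decode(word):
    """Return (status, data); status is 'ok', 'corrected' or 'uncorrectable'."""
    word = list(word)
    syndrome = 0
    for q in range(1, 39):
        if word[q]:
            syndrome ^= q
    odd_parity = sum(word) % 2 == 1
    if syndrome == 0 and not odd_parity:
        status = 'ok'
    elif odd_parity and syndrome < 39:
        # Treated as a single bit error; syndrome 0 means the overall
        # parity bit itself (index 0) is the failing bit.
        word[syndrome] ^= 1
        status = 'corrected'
    else:
        # Even parity with a nonzero syndrome (double bit error), or a
        # syndrome that points outside the word: report, do not correct.
        status = 'uncorrectable'
    data = sum(word[pos] << i for i, pos in enumerate(DATA_POS))
    return status, data
```

      Flipping any one of the 39 bits of an encoded word and calling
decode returns the original data with status 'corrected'; flipping any
two bits returns status 'uncorrectable' and leaves the data uncorrected.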

      This diagnostic test uses the 'nibble swapping' capability to
force various types of 1, 2, 3, and 4 bit errors within any one
module.
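
      The error patterns that can be forced within one module can be
enumerated as in the sketch below.  It assumes, for illustration only,
that module m occupies bits 4m to 4m+3 of a 40-bit word (ten modules
of four bits, of which the 39-bit ECC word uses all but one position);
the real bit-to-module assignment is not given in this text, and the
function names are hypothetical.

```python
from itertools import combinations

MODULES = 10          # from the text: ten modules
BITS_PER_MODULE = 4   # from the text: four bits each


def nibble_error_masks():
    """Map error size (1..4) to every 4-bit mask with that many bits set."""
    masks = {n: [] for n in range(1, BITS_PER_MODULE + 1)}
    for n in masks:
        for bits in combinations(range(BITS_PER_MODULE), n):
            masks[n].append(sum(1 << b for b in bits))
    return masks


def word_error_mask(module, nibble_mask):
    """Shift a nibble error mask to the assumed position of one module."""
    if not 0 <= module < MODULES:
        raise ValueError("module index out of range")
    return nibble_mask << (module * BITS_PER_MODULE)
```

      Under this layout each module has 4 single, 6 double, 4 triple,
and 1 quadruple bit error pattern, 15 in all.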

      It is possible to force single and multiple bit ECC errors
using the sparing logic.  The technique used is shown below.  The
data used in this test is shown in Tables 5 and 6.  The test consists
of multiple DMA operations shown in Table 1.  The asterisks in Tables
5 and 6 indicate that, in the first spare nibble, extra data bits are
used to put the ECC bits in the correct state to cause only single
bit errors.  (The addresses and data in the tables are given in
hexadecimal.)
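
      One plausible way that sparing could leave chosen bits in error
is modeled below.  This is an assumption made for illustration and is
not the DMA sequence of Table 1, which is not reproduced in this
abbreviated text: a stale nibble is first written to the real module,
the target word is then written while that module is spared, and the
word is finally read back with sparing turned off.  The class
SparedMemory and the function force_nibble_error are hypothetical
names.

```python
class SparedMemory:
    """Toy model: ten 4-bit modules plus one spare module per address."""

    def __init__(self):
        self.primary = {}          # address -> list of ten nibbles
        self.spare = {}            # address -> nibble held by the spare
        self.spared_module = None  # index of the module being replaced

    def write(self, addr, nibbles):
        row = self.primary.setdefault(addr, [0] * 10)
        for m, value in enumerate(nibbles):
            if m == self.spared_module:
                self.spare[addr] = value & 0xF  # spare takes this nibble
            else:
                row[m] = value & 0xF

    def read(self, addr):
        row = list(self.primary.get(addr, [0] * 10))
        if self.spared_module is not None:
            row[self.spared_module] = self.spare.get(addr, 0)
        return row


def force_nibble_error(mem, addr, target, module, mask):
    """Return target as read back with `mask` bits flipped in one module."""
    stale = list(target)
    stale[module] ^= mask
    mem.spared_module = None
    mem.write(addr, stale)       # real module now holds the stale nibble
    mem.spared_module = module
    mem.write(addr, target)      # spare absorbs the good nibble
    mem.spared_module = None
    return mem.read(addr)        # real module returns the stale nibble
```

      In this model the nibbles passed in stand for the full stored
word, data and check bits together, so the word read back differs from
the target only in the masked bits of the chosen module.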

      A reduced version of this test is shown in Table 2. The test
shown in Table 2 is the test for a single bit error in data bit
00. The test in Table 2 is the first test shown in Table 4.

      The test shown in Table 3 is the test for a double bit error
in data bits 00 and 01.  The test in Table 3 is the first test
shown in Table 5.

      Table 4 shows the tests for single bit errors.  Tables 5 and 6
show the tests for multiple bit errors within each nibble.

      Detected multiple bit errors supply an error indication.
Therefore, a test for multiple bit errors must monitor the error
indication, since the data read with error correction active will be
the precorrected data in all three cases: a detected multiple bit
error, an undetected multiple bit error, or no error.

      Undetected multiple bit errors can occur only when an even
number of bits, four or more, are in error and the failing bits lie
in two or more nibbles.
Therefore, the sparing logic, which af...