Fault Dispersion in Computer Memories

IP.com Disclosure Number: IPCOM000045670D
Original Publication Date: 1983-Apr-01
Included in the Prior Art Database: 2005-Feb-07
Document File: 3 page(s) / 49K

Publishing Venue

IBM

Related People

Chen, CL: AUTHOR

Abstract

Error correcting codes have been applied on computer memory systems to enhance the reliability and the data integrity of the systems.

This is the abbreviated version, containing approximately 53% of the total text.

Error correcting codes have been applied on computer memory systems to enhance the reliability and the data integrity of the systems.

Fig. 1 shows a typical arrangement of memory array chips with a (72,64) error correcting code (ECC) that corrects single errors and detects double errors. Each of the 72 chips in a particular row of the memory array contributes 1 bit to an ECC word. Thus, a chip failure causes at most 1 error in an ECC word and is correctable by the ECC. The failure of two chips in the same row may result in an uncorrectable error (UE) condition because there are then two errors in an ECC word.
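The relationship between failed chips and ECC words can be sketched as follows. This is only an illustration under stated assumptions: failures are treated as whole-chip failures, chips are identified by hypothetical (row, column) pairs with columns numbered 1 to 72, and a row with two or more failed chips is labeled "uncorrectable" as a worst case, even though the text says only that such a row may produce a UE.

```python
from collections import Counter

WORD_BITS = 72  # (72,64) code: 64 data bits + 8 check bits, one bit per chip


def classify_rows(faulty_chips, num_rows):
    """Classify each row by how many faulty chips feed its ECC words.

    faulty_chips: iterable of (row, column) pairs (hypothetical naming),
    assumed to be whole-chip failures.
    Returns a dict mapping row -> "clean", "correctable", or "uncorrectable".
    """
    chips = set(faulty_chips)
    for row, col in chips:
        if not (0 <= row < num_rows and 1 <= col <= WORD_BITS):
            raise ValueError(f"chip ({row}, {col}) is outside the array")

    # Each chip in a row supplies exactly one bit of that row's ECC words,
    # so the number of bad chips in a row bounds the errors per word.
    per_row = Counter(row for row, _col in chips)
    status = {}
    for row in range(num_rows):
        n = per_row[row]
        if n == 0:
            status[row] = "clean"
        elif n == 1:
            status[row] = "correctable"      # single error: corrected
        else:
            status[row] = "uncorrectable"    # worst case: two or more errors
    return status
```

For example, with four rows and failed chips at (0, 5), (0, 17) and (2, 40), row 0 is at risk of a UE while row 2 remains correctable.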

When a UE occurs, the erroneous data may be recovered through certain schemes involving the writing and the reading of data at the same memory location. However, some strategy has to be followed to avoid accessing the same faulty chips. Otherwise, as faulty chips accumulate, errors may be miscorrected or go undetected. One possible strategy is to reconfigure the memory array chips so that faulty chips do not line up to form uncorrectable errors in ECC words.

The memory array chips can be reconfigured with exclusive OR (XOR) gates operating on the row address, as shown in Fig. 2. There are 72 control registers CR(I), I = 1, 2, ..., 72, one for each column of the memory array. If a particular control register CR(I) is set to a nonzero state, the chips in column I are logically permuted. With a proper setting of all control registers at a given time, all ECC words in the memory may be error-free or contain only single errors.
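A minimal sketch of this permutation is given below. Fig. 2 is not reproduced here, so the wiring is an assumption: each column's physical row is taken to be the logical row address XORed with that column's control register, and the number of rows is assumed to be a power of two so that the XOR stays within range. The function names and the dictionary representation of the control registers are hypothetical.

```python
def physical_row(logical_row, control_register):
    """Row of the chip actually selected in one column (assumed XOR wiring)."""
    return logical_row ^ control_register


def logical_fault_map(physical_faults, control_registers):
    """Translate physical chip faults into the logical rows that see them.

    physical_faults: set of (physical_row, column) pairs.
    control_registers: dict column -> CR value; missing columns default to 0.
    XOR is its own inverse, so logical_row = physical_row ^ CR(column).
    """
    return {
        (row ^ control_registers.get(col, 0), col)
        for row, col in physical_faults
    }


# Two faulty chips in physical row 0 line up in the same ECC word when all
# control registers are zero, but are dispersed once CR(17) is set to 1.
faults = {(0, 5), (0, 17)}
aligned = logical_fault_map(faults, {})
dispersed = logical_fault_map(faults, {17: 1})
```

With all control registers at zero, both faults fall in logical row 0. Setting CR(17) to 1 moves the fault in column 17 to logical row 1, so each of the two ECC words then sees at most a single error.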

There are schemes for determining the states of the control registers so that the errors in the memory are all correctable by the ECC. These schemes require that the locations of the faulty chips and their fault types be known and kept in the computer system. The requirement of maintaining and updating the fault map of the memory is a drawback in practical applications.
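To illustrate why such schemes depend on a fault map, the following sketch chooses control register values greedily from a known set of faulty chips. It is not one of the schemes referred to above, whose details are not given in this text. It assumes whole-chip failures, the XOR wiring logical_row = physical_row XOR CR(column), and a power-of-two number of rows. The function name is hypothetical, and a greedy search may fail to find a setting even when one exists.

```python
def choose_control_registers(physical_faults, num_rows):
    """Greedy search for CR values that leave at most one fault per logical row.

    physical_faults: set of (physical_row, column) pairs (the fault map).
    Returns a dict column -> CR value, or None if the greedy pass fails.
    """
    if num_rows <= 0 or num_rows & (num_rows - 1):
        raise ValueError("num_rows must be a power of two")

    # Group the fault map by column: one control register per column.
    by_column = {}
    for row, col in physical_faults:
        by_column.setdefault(col, set()).add(row)

    used_rows = set()   # logical rows that already contain one faulty chip
    settings = {}
    for col in sorted(by_column):
        for cr in range(num_rows):
            logical = {row ^ cr for row in by_column[col]}
            if used_rows.isdisjoint(logical):
                settings[col] = cr
                used_rows |= logical
                break
        else:
            return None  # no CR value for this column avoids a collision
    return settings
```

Every step of the search reads the fault map, which shows concretely why the map has to be maintained and updated for this kind of approach to work.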

In the following, a scheme for dispersing the memory errors is described. The scheme doe...