Implementation of Reserve Storage for Bit/Word Hard Failures in Main Memory Without Software Support

IP.com Disclosure Number: IPCOM000046982D
Original Publication Date: 1983-Sep-01
Included in the Prior Art Database: 2005-Feb-07
Document File: 3 page(s) / 58K

Publishing Venue

IBM

Related People

Ames, RN: AUTHOR [+4]

Abstract

Bit/word hard failures in main memory can be corrected without replacing the Field Replaceable Unit (FRU), and the correction can be implemented without a costly rewrite of operating system software. The feature is described in conjunction with a processor such as an IBM Series/1 processor, discussed in the manual entitled "IBM Series/1 System Summary," GA34-0035. The drawing shows the data flow and hardware necessary for implementation. The Reserve Storage Address Register and Reserve Storage Data Register can be designed to accommodate the addressing range and data width of the system. Any specific design depends on the trade-off between the (low) cost of the added hardware and the reduction in service costs. Logic circuitry is utilized for online substitution of hard failures in main storage.

This is the abbreviated version, containing approximately 54% of the total text.

A hard failure is a permanent malfunction of a cell or cells at a particular address in main storage.

Presently, on machines without Error Correction Codes (ECCs), this type of error requires the replacement of the FRU (i.e., a memory module if pluggable or an entire card if the memory modules are permanently attached). ECC machines can detect and correct this failure. However, the cost to implement ECC is very large and not justified if the soft error rate for the storage technology is negligible.

Operating system software support is not required since the substitution of the failing storage address is accomplished during power-on microdiagnostics. This reduces the cost of writing code to invoke diagnostic instructions and to restart the failing program.

The figure includes circuit logic with blocks 1-8 having the functions indicated below:

Block No.  Function
1.         Load Reserve Storage Address latch
2.         Load Reserve Storage Address Logic
3.         Storage Address Register (SAR) and Active Address Key (AAK)
4.         Segmentation Register (Seg Reg)
5.         Address Translator
6.         Address Comparator
7.         Storage Data Register (SDR)
8.         Reserve Storage Data Register

Briefly, the circuit works as follows: All of the fitted main storage is written and then read back during processor power-on microdiagnostics. The following is an outline of the sequence of events that occur when a failure is detected during the power-on microdiagnostics.

1. The main storage test (a) writes all fitted storage and (b) reads all fitted storage.
2....
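
The part of the mechanism that survives in this abbreviated text can be illustrated with a small software model. The following Python sketch is only an illustration under stated assumptions and is not the disclosed circuit: the class and function names, the 16-bit word width, the stuck-at-0 fault model, the single test pattern, and the rule that every access to a matching address is steered to the Reserve Storage Data Register are all hypothetical. Address translation (SAR/AAK, Seg Reg, Address Translator) is not modeled.

```python
WORD_MASK = 0xFFFF  # assumed 16-bit data width


class MainStorage:
    """Toy word-addressed main storage with optional hard failures."""

    def __init__(self, size, stuck=None):
        self.cells = [0] * size
        # Hypothetical fault model: address -> mask of bits stuck at 0.
        self.stuck = dict(stuck or {})

    def write(self, addr, value):
        self.cells[addr] = value & WORD_MASK & ~self.stuck.get(addr, 0)

    def read(self, addr):
        return self.cells[addr]


class ReserveStorage:
    """Main storage plus one reserve word selected by an address compare."""

    def __init__(self, main):
        self.main = main
        self.latch = False        # set once a reserve address has been loaded
        self.reserve_addr = None  # models the Reserve Storage Address Register
        self.reserve_data = 0     # models the Reserve Storage Data Register

    def load_reserve_address(self, addr):
        self.reserve_addr = addr
        self.latch = True

    def _match(self, addr):
        # Models the Address Comparator: active only after the latch is set.
        return self.latch and addr == self.reserve_addr

    def write(self, addr, value):
        if self._match(addr):
            self.reserve_data = value & WORD_MASK
        else:
            self.main.write(addr, value)

    def read(self, addr):
        return self.reserve_data if self._match(addr) else self.main.read(addr)


def power_on_test(storage, size, pattern=0xA5A5):
    """Write all fitted storage, read it back, substitute the first failure.

    A single pattern cannot expose every stuck bit; a real diagnostic
    would presumably use several. Only one failing address can be covered
    because this model has a single reserve word.
    """
    for addr in range(size):
        storage.write(addr, pattern)
    failures = [a for a in range(size) if storage.read(a) != pattern]
    if failures:
        storage.load_reserve_address(failures[0])
        storage.write(failures[0], pattern)
    return failures


if __name__ == "__main__":
    main = MainStorage(8, stuck={5: 0x0004})  # bit 2 of word 5 stuck at 0
    mem = ReserveStorage(main)
    failures = power_on_test(mem, 8)
    mem.write(5, 0x1234)
    print(failures, hex(mem.read(5)))
```

In this model the program that later uses address 5 needs no knowledge of the substitution, which mirrors the point made above that no operating system support is required once the reserve address has been loaded at power-on.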