
Fault Tolerant Design Showing Levels of Availability

IP.com Disclosure Number: IPCOM000103438D
Original Publication Date: 1990-Nov-01
Included in the Prior Art Database: 2005-Mar-17
Document File: 1 page(s) / 48K

Publishing Venue

IBM

Related People

Garofalo, FJ, Jr: AUTHOR [+2]

Abstract

High-availability machines are designed to minimize the effect of a failing component on their operation and service. They are designed with both redundant components and a multiplicity of components performing the same general function. Failures in these components leave the machine in various states of operation. A failure in a redundant component should not affect operation, while other failures may cause only a small loss of function.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 75% of the total text.


      High-availability machines are designed to minimize the effect
of a failing component on their operation and service.  They are
designed with both redundant components and a multiplicity of
components performing the same general function.  Failures in these
components leave the machine in various states of operation.  A
failure in a redundant component should not affect operation, while
other failures may cause only a small loss of function.

      Levels of operation are defined to match these conditions.  The
number of levels depends on the product design.  An example with
five levels of operation is given below:
(1)  fully functioning;
(2)  fully functioning with redundant component down (this allows
deferred maintenance);
(3)  functional in a slightly degraded mode (this allows deferred
maintenance, e.g. a failure affecting one of many links);
(4)  major degradation (e.g. half the functions fail to operate); the
machine must be repaired soon, but the user can still get use out of
the product; and
(5)  the machine is down; all function is lost.
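      The five example levels can be sketched as an enumeration.  This
is only an illustration, not part of the original disclosure: the
names OperationLevel, is_usable and maintenance_urgency are
hypothetical, and the "immediate" urgency for level 5 is an
assumption, since the text says only that all function is lost.

```python
from enum import IntEnum


class OperationLevel(IntEnum):
    """The five example levels of operation (names are hypothetical)."""
    FULLY_FUNCTIONING = 1
    REDUNDANT_COMPONENT_DOWN = 2
    SLIGHTLY_DEGRADED = 3
    MAJOR_DEGRADATION = 4
    DOWN = 5


def is_usable(level):
    """Levels 1 through 4 still give the user some function."""
    return OperationLevel(level) < OperationLevel.DOWN


def maintenance_urgency(level):
    """Map a level to the maintenance policy described in the list above."""
    level = OperationLevel(level)
    if level == OperationLevel.FULLY_FUNCTIONING:
        return "none"
    if level in (OperationLevel.REDUNDANT_COMPONENT_DOWN,
                 OperationLevel.SLIGHTLY_DEGRADED):
        return "deferred"   # levels 2 and 3 allow deferred maintenance
    if level == OperationLevel.MAJOR_DEGRADATION:
        return "soon"       # level 4 must be repaired soon
    return "immediate"      # assumption: the text does not state this for level 5
```

Using an ordered enumeration keeps the levels comparable, so a
product with a different number of levels would only need a different
set of members.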

      States 1, 2, 3 and 4 are detected from hardware indications,
which microcode examines and reports.  When a redundant component
fails, another one can be set up as a substitute.  The machine then
reports that it is in state 2 and can indicate the failure.
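      The substitution of a redundant component and the resulting
state 2 report might be modeled as follows.  This is a minimal sketch
under stated assumptions: RedundantGroup, report_failure and
report_state are hypothetical names, the real mechanism is hardware
indications read by microcode, and the sketch deliberately returns no
level once every component in the group has failed, because that
classification depends on the product design.

```python
from dataclasses import dataclass, field


@dataclass
class RedundantGroup:
    """A set of interchangeable components, one active (hypothetical model)."""
    name: str
    active: str
    spares: list
    failed: list = field(default_factory=list)

    def report_failure(self, component):
        """Record a failure; if the active component failed, set up a spare."""
        if component in self.failed:
            return
        self.failed.append(component)
        if component in self.spares:
            self.spares.remove(component)
        elif component == self.active:
            # Substitute another redundant component if one is left.
            self.active = self.spares.pop(0) if self.spares else None


def report_state(group):
    """Return (state, failed components) for a single redundant group."""
    if not group.failed:
        return 1, []
    if group.active is not None:
        # A substitute is running: full function, maintenance can be deferred.
        return 2, list(group.failed)
    # No substitute left; the level is product-specific, so none is reported.
    return None, list(group.failed)


group = RedundantGroup("power", active="psu_a", spares=["psu_b"])
group.report_failure("psu_a")
state, failures = report_state(group)
```

After the single failure above, the group runs on the substitute
component and the report carries both the state and the failed part,
which is what makes deferred maintenance practical.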

      To simplify the implementation of this approach:
(1) ...