Method and Apparatus for Signalling Hot Failover in a Redundant I/O Subsystem

IP.com Disclosure Number: IPCOM000014532D
Original Publication Date: 2001-Jun-09
Included in the Prior Art Database: 2003-Jun-19
Document File: 6 page(s) / 75K

Publishing Venue

IBM

Abstract

Disclosed is a mechanism for one adapter (dubbed the "Master") to disable a peer adapter (dubbed the "Slave"), and then restart the Slave adapter at a later time. The Slave adapter is held in stasis (i.e. the reset state) while the Master adapter performs any needed critical operations. The Master adapter is in complete control of the process and decides when the Slave adapter is to be disabled, how long to hold it in stasis, and when to restart it. When the Slave adapter is restarted it goes back through its initial bringup/boot phase again. The mechanism is accomplished with a direct connection between the adapters. No host operating system support is needed to support this function, resulting in a simpler implementation. The two adapters may even reside in separate enclosures, under different power domains, or even in separate systems. The adapter to be disabled is identified easily because it is implicit in how the system is cabled. This mechanism also is capable of meeting stringent time/performance constraints because the direct connection between adapters allows one adapter to be disabled immediately when required. An aggregation of the functional pieces described below forms a mechanism for controlling access to a shared resource by adapters. The mechanism provides the following functions for the peer adapters: The Master adapter can disable (aka "Fence") the Slave adapter.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 27% of the total text.

   Disclosed is a mechanism for one adapter (dubbed the "Master") to disable a peer adapter (dubbed the "Slave"), and then restart the Slave adapter at a later time. The Slave adapter is held in stasis (i.e. the reset state) while the Master adapter performs any needed critical operations. The Master adapter is in complete control of the process and decides when the Slave adapter is to be disabled, how long to hold it in stasis, and when to restart it. When the Slave adapter is restarted it goes back through its initial bringup/boot phase again.

The mechanism is implemented with a direct connection between the adapters. No host operating system support is needed for this function, which simplifies the implementation. The two adapters may reside in separate enclosures, under different power domains, or even in separate systems. The adapter to be disabled is easily identified because it is implicit in how the system is cabled. The mechanism can also meet stringent time/performance constraints because the direct connection between adapters allows one adapter to be disabled immediately when required.

An aggregation of the functional pieces described below forms a mechanism for controlling access to a shared resource by adapters. The mechanism provides the following functions for the peer adapters:

1. The Master adapter can disable (aka "Fence") the Slave adapter.

2. The Master adapter can hold the Slave adapter in stasis as long as needed. While the Slave adapter is in stasis, the Master adapter may perform any critical operations needed to shared resources.

3. The Master adapter restarts (aka Unfence) the Slave adapter in its initial bringup or reset condition.

4. Either adapter can run without requiring the presence of the other adapter. This allows either adapter to be concurrently maintained or replaced, and to go through power-on cycles and resets without impacting the other adapter.

5. Configurations are also not limited to the single pair of adapters with one Master and one Slave. It is possible for one adapter to be Master for a number of Slave adapters, for one adapter to be Slave to a number of Master Adapters, or for an adapter to contain both the Master and Slave functions. This is accomplished by using multiple instances of the functional pieces described below.
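
The Fence, hold, and Unfence sequence described above can be sketched as a small state model. This is only an illustration under stated assumptions, not the disclosed hardware: the class names, method names, and the three-state enum are hypothetical, and a real adapter would be held in reset by a signal rather than a method call.

```python
from enum import Enum, auto


class State(Enum):
    BOOTING = auto()  # initial bringup, also entered after every Unfence
    RUNNING = auto()  # normal operation
    FENCED = auto()   # held in the reset state ("stasis")


class SlaveAdapter:
    """Hypothetical model of the Slave side; all names are illustrative."""

    def __init__(self):
        self.state = State.BOOTING
        self.boots = 0

    def finish_bringup(self):
        # Bringup completes only if the adapter is not being held in reset.
        if self.state is State.BOOTING:
            self.boots += 1
            self.state = State.RUNNING

    def hold_in_reset(self):
        self.state = State.FENCED

    def release_reset(self):
        # Leaving reset always restarts the initial bringup phase.
        if self.state is State.FENCED:
            self.state = State.BOOTING


class MasterAdapter:
    """Hypothetical model of the Master side; it alone decides the timing."""

    def __init__(self):
        self.cabled_slaves = []  # the cabling implies which adapters to fence

    def cable_to(self, slave):
        self.cabled_slaves.append(slave)

    def run_critical(self, operation):
        # Fence every cabled Slave, run the operation, then Unfence.
        for slave in self.cabled_slaves:
            slave.hold_in_reset()
        try:
            return operation()
        finally:
            for slave in self.cabled_slaves:
                slave.release_reset()
```

In this sketch a Master with no cabled Slaves still runs its critical operation, and a Slave that is never fenced simply boots and runs, which mirrors the point that either adapter can operate without the other.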

The functional pieces are described briefly here as an overview, and individually elaborated upon below. Figure 1 shows the relationship between the pieces. The pieces are:

A. Fence-out Logic - Contained on the Master IOA, it initiates and controls the Fence action.

B. Fence Connection - Communication pathway between the Master IOA and Slave IOA, it notifies the Slave IOA when a Fence is to occur.

C. Fence-in Logic - Contained on the Slave IOA, it is the recipient of the Fence action and is responsible for disabling the Slave IOA as appropriate when it occurs.
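
The relationship between the three pieces can be sketched as three small objects, one instance per Master/Slave pairing. This is a hedged illustration only: the class and method names are hypothetical, the connection is modeled as a single boolean level, and two behaviors are assumptions not stated in the text, namely that an unconnected or idle connection reads as "not fenced" and that a Slave with several Masters stays disabled while any one of them asserts its Fence.

```python
class FenceConnection:
    """Pathway between one Fence-out instance and one Fence-in instance."""

    def __init__(self):
        self.asserted = False  # assumed idle level: no Fence requested


class FenceOutLogic:
    """Master-side piece: initiates and controls the Fence action."""

    def __init__(self, connection):
        self.connection = connection

    def fence(self):
        self.connection.asserted = True

    def unfence(self):
        self.connection.asserted = False


class FenceInLogic:
    """Slave-side piece: disables the Slave IOA while a Fence is active."""

    def __init__(self, connections=()):
        self.connections = list(connections)

    def ioa_disabled(self):
        # Assumption: any asserted connection holds the Slave IOA in reset.
        return any(c.asserted for c in self.connections)
```

Using one `FenceConnection` per pairing is one way to express the "multiple instances" idea: a Master for several Slaves owns several `FenceOutLogic` objects, and a Slave to several Masters passes several connections to its `FenceInLogic`.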

1...