
Neighbor and Maverick Fault Recovery in a Bidirectional Insertion Ring

IP.com Disclosure Number: IPCOM000036666D
Original Publication Date: 1989-Oct-01
Included in the Prior Art Database: 2005-Jan-29
Document File: 4 page(s) / 56K

Publishing Venue

IBM

Related People

Hall, WD: AUTHOR [+2]

Abstract

This article describes a technique for recovery of a neighbor or maverick fault in a bidirectional insertion ring.

This is the abbreviated version, containing approximately 53% of the total text.



A bidirectional insertion ring can experience a failure or fault in many ways. The recovery procedures described herein may be used to circumvent the following faults:

1. Continuous parity or cyclic redundancy check (CRC) error received from primary neighbor.
2. Continuous parity or CRC error received from secondary neighbor.
3. Continuous physical code violations received from primary neighbor.
4. Continuous physical code violations received from secondary neighbor.

(Image Omitted)

These types of faults are characterized by the presence of a carrier signal and a block check character error or a physical protocol error which is continuous in nature and thus distinguished from a soft error.
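The distinction between a continuous fault and a soft error can be illustrated with a minimal sketch. The article does not state how long an error must persist before it counts as continuous, so the `threshold` parameter below, along with the `Sample`, `Verdict`, and `classify` names, is a hypothetical choice made only for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    SOFT_ERROR = "soft error"
    CONTINUOUS_FAULT = "continuous fault"


@dataclass
class Sample:
    """One received frame as seen from a neighbor (hypothetical model)."""
    carrier_present: bool
    check_ok: bool  # True when parity/CRC and physical coding are both valid


def classify(samples, threshold=8):
    """Classify a stream of samples from one neighbor.

    A fault is treated as continuous when at least `threshold` consecutive
    samples show an error while the carrier is present. The threshold value
    is an assumption; the source does not specify one.
    """
    run = 0          # length of the current run of errored samples
    longest = 0      # longest run seen so far
    any_error = False
    for s in samples:
        if s.carrier_present and not s.check_ok:
            run += 1
            any_error = True
            longest = max(longest, run)
        else:
            run = 0  # a good sample (or loss of carrier) ends the run
    if longest >= threshold:
        return Verdict.CONTINUOUS_FAULT
    return Verdict.SOFT_ERROR if any_error else Verdict.OK
```

Samples without a carrier are deliberately not counted here, since the faults listed above are characterized by the presence of a carrier signal; loss of carrier is a different fault.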

In a distributed data processing (DDP) system such as the one disclosed in [*], the local insertion bidirectional ring architecture allows for high availability via the use of two rings which transmit data in opposite directions, as illustrated in Fig. 1. When a fault occurs, the system operations port (SOP) is notified of the fault and takes the appropriate steps to circumvent the problem without any software intervention. A fault refers to an error which cannot be corrected via retries, such as a severed ring, but which may be circumvented by using an alternate path.

The alternate path can be established via two methods. One method is to terminate operation on the ring that has had the fault - single ring operation. The second method is to logically wrap the rings together on each side of the fault - wrap operation.

The single ring operation method is simple to implement, but it does not solve the problem of both rings being severed or the problem of multiple faults which affect both rings. Both problems are quite likely at initial installation of a distributed system. Additionally, single ring operation significantly increases the traffic on the ring (as seen by any one drop).

The wrap operation method is more complicated to implement, but solves the above problems since any one fault can be isolated from the operational ring without loss of any drop and multiple faults can be isolated with the loss of only the drops between the faults. Additionally, since the faults are isolated from the operational ring, replacement of the failing components can be done without taking the ring off line.
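The trade-off between the two methods can be sketched with a simplified model. Everything below is an assumption made for illustration: drops are numbered 0 to n-1, span i is the cabling between drop i and drop (i+1) mod n, a fault is a (ring, span) pair, the operational ring after wrapping is taken to be the segment containing the SOP, and the SOP is placed at drop 0. The function names are hypothetical and the model ignores faults inside a drop itself.

```python
def single_ring_ok(faults):
    """Single ring operation works only if one ring has no fault at all.

    faults: set of (ring, span) pairs, ring being "primary" or "secondary".
    """
    rings_hit = {ring for ring, _ in faults}
    return len(rings_hit) < 2


def wrap_survivors(n_drops, faults, sop_drop=0):
    """Return the drops still on the operational ring after wrapping.

    Each faulted span is isolated by wrapping the rings together on both
    sides of it, so the span is removed for both directions. The surviving
    drops are those reachable from the SOP without crossing a faulted span.
    """
    bad_spans = {span for _, span in faults}
    if not bad_spans:
        return set(range(n_drops))
    survivors = {sop_drop}
    # Walk in the primary direction until a faulted span blocks the way.
    d = sop_drop
    while d not in bad_spans:
        d = (d + 1) % n_drops
        survivors.add(d)
    # Walk in the secondary direction until a faulted span blocks the way.
    d = sop_drop
    while (d - 1) % n_drops not in bad_spans:
        d = (d - 1) % n_drops
        survivors.add(d)
    return survivors


if __name__ == "__main__":
    one_fault = {("secondary", 2)}
    two_faults = {("primary", 1), ("secondary", 3)}
    print(single_ring_ok(one_fault), sorted(wrap_survivors(6, one_fault)))
    print(single_ring_ok(two_faults), sorted(wrap_survivors(6, two_faults)))
```

In this model a single fault costs no drops under wrap operation, while two faults on different rings defeat single ring operation and cost wrap operation only the drops lying between the two faulted spans.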


In the fault recovery procedures using the wrap operation method, assume that the secondary transmitter of drop 3 fails at time = T such that the secondary receiver of port 1 no longer receives a carrier signal.

When the secondary transmitter of drop 3 fails, the loss of the carrier signal is detected by port 1 and the port issues the secondary ring fault order to the SOP O.

When the SOP receives this order, it will issue the Broadcast Primary Wrap order to...