Method and system for maintaining synchronization of objects in an environment with an even number of redundant Nodes

IP.com Disclosure Number: IPCOM000236167D
Publication Date: 2014-Apr-10
Document File: 2 page(s) / 79K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method that allows the continuous operation of a system that consists of a single Management Node and a number of redundant managed Nodes, even when a managed Node is not in the operational state. This is accomplished through the creation of a persistent record that indicates which node made the last change.

This is the abbreviated version, containing approximately 51% of the total text.

A system consisting of a single Management Node and N redundant managed Nodes (where N is even, i.e., N mod 2 = 0) must keep the managed Nodes synchronized to ensure operational capability if one managed Node fails. Each managed Node maintains a set of objects that contain information about the configuration and operational status of the system.

The managed Nodes can fail independently. Each managed Node can operate the entire system independently. Nominally, there are two managed Nodes operating in parallel, sharing the operational load between them.

A problem arises in the following situation:

• Node-2 fails.
• Node-1 continues operation, taking over Node-2's share of operations.
• Updates are made to the management objects (which describe the configuration, control, and status) in Node-1. Since Node-2 is in a non-operational state, it has no knowledge of the updates that have been made.
• In the normal case, when Node-2 again becomes operational, Node-1 sends any changes in its management objects to Node-2 in order to re-synchronize.
• In the failure case, which this disclosure addresses, at the time when Node-2 again becomes operational, Node-1 fails. At this point, the N Nodes cannot re-synchronize. Node-2 is operational and running the system, but has no knowledge of the changes made to the management objects in Node-1 before it failed.
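
The failure case above can be illustrated with a small simulation. This is only a sketch for illustration: the class `ManagedNode`, the function `resync`, and the `vlan` management object are hypothetical names, not part of the disclosure.

```python
from dataclasses import dataclass, field


@dataclass
class ManagedNode:
    """Hypothetical model of a managed Node and its management objects."""
    name: str
    operational: bool = True
    objects: dict = field(default_factory=dict)


def resync(source: ManagedNode, target: ManagedNode) -> bool:
    """Copy management objects from source to target.

    Re-synchronization is only possible while both Nodes are operational.
    """
    if not (source.operational and target.operational):
        return False
    target.objects = dict(source.objects)
    return True


# Both Nodes start in sync.
node1 = ManagedNode("Node-1", objects={"vlan": 10})
node2 = ManagedNode("Node-2", objects={"vlan": 10})

node2.operational = False    # Node-2 fails
node1.objects["vlan"] = 20   # update made on Node-1 while Node-2 is down

# Failure case: Node-1 fails at the moment Node-2 becomes operational again.
node2.operational = True
node1.operational = False

synced = resync(node1, node2)
print(synced, node2.objects)  # Node-2 runs the system with the old value
```

In this sketch Node-2 ends up operational but still holds the value from before its outage, which is the loss of knowledge described in the last bullet.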

The most likely cause of the failure scenario above is in the switch-over/control functions between the redundant nodes. In a worst-case scenario, the two Nodes may 'ping-pong' between failed and operational, with neither Node having knowledge of the changes made to management objects while that Node was in the failed state.

The obvious solution is to prevent all configuration/status changes from occurring unless all Nodes are in the operational state. However, in a system designed for high availability, this is not an acceptable solution.

The novel contribution is a method that allows the continuous operation of the system, even when a managed Node is not in the operational state. This solution involves the creation of a persistent record that indicates which node made the last change. This allows additional changes to be made wh...
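
A minimal sketch of a persistent last-change record follows, under assumptions not stated in the text above: the record is kept as a small JSON file, it carries a change counter in addition to the name of the node that made the last change, and a node uses it to detect that it has missed changes. The names `LastChangeRecord` and `is_stale`, the file layout, and the counter are all hypothetical.

```python
import json
import os
import tempfile


class LastChangeRecord:
    """Hypothetical persistent record of which node made the last change."""

    def __init__(self, path: str):
        self.path = path

    def write(self, node_name: str, version: int) -> None:
        # Write to a temporary file, then replace atomically, so a crash
        # mid-write does not leave a half-written record behind.
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump({"last_writer": node_name, "version": version}, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.path)

    def read(self):
        if not os.path.exists(self.path):
            return None
        with open(self.path) as f:
            return json.load(f)


def is_stale(node_name: str, node_version: int, record: LastChangeRecord) -> bool:
    """Assumed rule: a node is stale if another node made a newer change."""
    rec = record.read()
    if rec is None:
        return False
    return rec["last_writer"] != node_name and rec["version"] > node_version


# Usage: Node-1 makes change number 1 while Node-2 is down at version 0.
path = os.path.join(tempfile.mkdtemp(), "last_change.json")
record = LastChangeRecord(path)
record.write("Node-1", 1)

node2_stale = is_stale("Node-2", 0, record)  # Node-2 missed Node-1's change
node1_stale = is_stale("Node-1", 1, record)  # Node-1 made the change itself
print(node2_stale, node1_stale)
```

Because the record survives the failure of the node that wrote it, a node that returns to service can tell that its management objects are out of date even when its peer is no longer available to say so. Where such a record would actually be stored, and what is done once staleness is detected, is not covered by this sketch.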