An Efficient Fault-tolerant Membership Protocol

IP.com Disclosure Number: IPCOM000062792D
Original Publication Date: 1986-Dec-01
Included in the Prior Art Database: 2005-Mar-09
Document File: 1 page(s) / 12K

Publishing Venue

IBM

Related People

Aghili, H: AUTHOR [+3]

Abstract

An efficient fault-tolerant protocol for computing the processor membership of a dynamically-evolving distributed processor system is proposed. The proposed protocol guarantees finite processor failure and join detection delays, provides all correctly functioning processors with identical membership information at any local clock time, is tolerant of any number of likely faults except network partitioning, and has minimal run-time overhead when no faults occur. In addition, while assuming no shared storage among processors, the protocol scales well as the number of processors in a system increases.

This is the abbreviated version, containing approximately 60% of the total text.

For the proposed protocol, there are two kinds of changes in network processor membership which must be anticipated: (1) reductions in the number of correctly functioning processors in the set, and (2) additions of processors to that set. When the processor membership of the network remains the same, the protocol is said to be in "steady-state" mode. In this mode, each member processor periodically transmits "alive" messages, which include identity information, and periodically monitors the corresponding messages from other member processors. Thus, the absence of a timely alive message from a member processor is interpreted as a failure of that processor.
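
The steady-state behavior described above can be illustrated with a minimal sketch. It is not the disclosed protocol itself: the class name MembershipMonitor, the message layout, and the fixed timeout value are assumptions made for illustration, and message transport, clock synchronization, and agreement among processors are left out.

```python
from dataclasses import dataclass, field


@dataclass
class MembershipMonitor:
    """Steady-state view held by one member processor (illustrative only)."""
    self_id: str
    timeout: float  # assumed bound on the gap between two alive messages
    last_seen: dict = field(default_factory=dict)  # member id -> local clock time

    def add_member(self, member_id: str, now: float) -> None:
        # Record a processor as a current member, last heard from at `now`.
        self.last_seen[member_id] = now

    def make_alive(self) -> dict:
        # The alive message carries the sender's identity (hypothetical layout).
        return {"type": "alive", "sender": self.self_id}

    def on_alive(self, message: dict, now: float) -> None:
        # Refresh the timestamp of a known member; joins are not modelled here.
        sender = message["sender"]
        if sender in self.last_seen:
            self.last_seen[sender] = now

    def suspected_failures(self, now: float) -> set:
        # A member with no timely alive message is interpreted as failed.
        return {m for m, t in self.last_seen.items() if now - t > self.timeout}


monitor = MembershipMonitor(self_id="p1", timeout=3.0)
monitor.add_member("p2", now=0.0)
monitor.add_member("p3", now=0.0)
monitor.on_alive({"type": "alive", "sender": "p2"}, now=2.0)
print(monitor.suspected_failures(now=4.0))  # {'p3'}
```

In this sketch p3 is reported because it has been silent for longer than the assumed timeout, while p2 is not because its alive message arrived in time. How the real protocol chooses the timeout and gives all correct processors identical membership information is not covered by the abbreviated text.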

When a new processor joins the processors already in the network, it transmits a membership update message to all member processors, and the network is said to be in "up-date" mode...