Browse Prior Art Database

Fast Protocol Resilient to a Wide Class of Failures

IP.com Disclosure Number: IPCOM000105110D
Original Publication Date: 1993-Jun-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 40K

Publishing Venue

IBM

Related People

Garay, JA: AUTHOR [+2]

Abstract

A common technique for constructing computer systems that are able to tolerate faults is to replicate the computation on several processors. In order to apply this technique, two tools are necessary: a fault-tolerant Broadcast method for broadcasting data to all processors (such data may originate either from one of the processors or as the output of a sensor) and a fault-tolerant Consensus method allowing the processors to reach consensus on the computation's result, so that they may take a common action (e.g., activating an actuator). These two methods have been studied in the abstract and are known respectively as the Byzantine Generals Problem and the Consensus Problem.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 72% of the total text.

Fast Protocol Resilient to a Wide Class of Failures

      A common technique for constructing computer systems that are
able to tolerate faults is to replicate the computation on several
processors.  In order to apply this technique, two tools are
necessary:  a fault-tolerant Broadcast method for broadcasting data
to all processors (such data may originate either from one of the
processors or as the output of a sensor) and a fault-tolerant
Consensus method allowing the processors to reach consensus on the
computation's result, so that they may take a common action (e.g.,
activating an actuator).  These two methods have been studied in the
abstract and are known respectively as the Byzantine Generals Problem
and the Consensus Problem.

      Methods have been disclosed for Broadcast and Consensus, as
well as a means for combining the methods to implement a
fault-tolerant computing system.  The salient feature of these
methods was the ability of the n processors to overcome up to t
processor failures where processors may fail either by crashing or
acting arbitrarily (even maliciously), subject to no more than b
processors acting arbitrarily.  The running time of these protocols
was 5 times (t+1).

      Protocols for the same problem whose running time is 3 times
(t+1), at the expense of requiring  a slightly higher processor
redundancy are disclosed here i.e., the total number of required
processors is n > 2t + b (as opposed to n > t + 2b in the previous
di...