Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Performing Work Efficiently in the Presence of Faults

IP.com Disclosure Number: IPCOM000108625D
Original Publication Date: 1992-Jun-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 1 page(s) / 56K

Publishing Venue

IBM

Related People

Dwork, C: AUTHOR [+3]

Abstract

An algorithm is disclosed for performing work in the presence of faults. The algorithm permits a set of t possibly faulty processes to accomplish n units of work, maintaining a good balance between bookkeeping messages (those informing processes about work that one process has done) and total work performed. In every execution of the protocol either all the work is performed or all the processes fail. The total work performed is O(n) units, and the total number of messages is O(t**(3/2)).

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 56% of the total text.

Performing Work Efficiently in the Presence of Faults

       An algorithm is disclosed for performing work in the
presence of faults.  The algorithm permits a set of t possibly faulty
processes to accomplish n units of work, maintaining a good balance
between bookkeeping messages (those informing processes about work
that one process has done) and total work performed.  In every
execution of the protocol either all the work is performed or all the
processes fail.  The total work performed is O(n) units, and the
total number of messages is O(t**(3/2)).

      The idea of the algorithm is as follows.  For ease of
exposition, we assume that t is a perfect square, and that n is
divisible by t.  For convenience, we assume that the processes are
numbered 0 through t-1, and that the units of work are numbered 1
through n.  We divide the processes into t**(1/2) groups of size
t**(1/2) each and divide the work into t**(1/2) chunks, each of size
n/t**(1/2), and subdivide the chunks into t**(1/2) subchunks of size
n/t.

      The algorithm guarantees that each round, at most one process
is active.  The active process is the only process performing work.
If process i is active, then it knows that processes 0 to i-1 have
crashed or terminated.  Initially, process 0 is active.  The
algorithm for process 0 is straightforward:  process 0 starts out
doing the work, a subchunk at a time.  After completing a subchunk c,
it notifies the remaining processes in its group...