Scheme for Message-Time Tradeoff for Combining Data in Message-Passing Systems
Original Publication Date: 1993-Apr-01
Included in the Prior Art Database: 2005-Mar-19
Publishing Venue
IBM
Related People
Aggarwal, A: AUTHOR [+2]
Abstract
In a message-passing system, each processor has one piece of data to combine with pieces of data of other processors. Data is communicated by sending messages between processors in specified communication rounds. Disclosed is a scheme for trading the number of messages with the number of communication rounds required to combine data distributed among processors.
The operation of combining data that are distributed among processors is frequently used in message-passing systems. In such systems, processors communicate by sending messages in specified communication rounds. In each communication round, a processor can send one message to another processor and can receive one message from another processor. Two complexity measures are usually used for evaluating communication algorithms: the number m of messages sent and the number t of communication rounds required.
The combining problem can be formulated as follows. The system consists of n processors p sub 0, p sub 1, ..., p sub <n-1>. Initially, processor p sub i has a piece of data d sub i. The goal is to compute the result of d sub 0 + d sub 1 + ... + d sub <n-1>, where + is some associative function, and to distribute the result to all n processors. It is assumed that each processor can use the function + to combine data items.
Two standard algorithms for combining data are used. One algorithm first uses a binary tree to reduce the data from the n processors into processor p sub 0, and then uses a binary tree to broadcast the reduction result from processor p sub 0 to all n processors. In this algorithm, a total of m = 2(n-1) messages are sent in t = 2 log n communication rounds.
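The reduce-then-broadcast scheme can be sketched as a round-by-round simulation. This is an illustrative sketch, not part of the disclosure: the function name combine_tree and the bookkeeping of the message count m and round count t are assumptions added here.

```python
def combine_tree(data, op):
    """Simulate binomial-tree reduction into p_0 followed by a
    tree broadcast back out. `data` holds the initial values d_i and
    `op` is an associative combining function. Returns (results,
    messages, rounds). Illustrative helper, not from the disclosure."""
    n = len(data)
    vals = list(data)
    messages = 0
    rounds = 0
    # Distances 1, 2, 4, ... used by both phases.
    ks = []
    k = 1
    while k < n:
        ks.append(k)
        k *= 2
    # Reduction phase: in the round for distance k, processor p_i with
    # i mod 2k == k sends its partial result to p_{i-k}.
    for k in ks:
        for i in range(n):
            if i % (2 * k) == k:
                vals[i - k] = op(vals[i - k], vals[i])
                messages += 1
        rounds += 1
    # Broadcast phase: the same tree traversed in reverse, so each
    # processor that already holds the result forwards it.
    for k in reversed(ks):
        for i in range(n):
            if i % (2 * k) == 0 and i + k < n:
                vals[i + k] = vals[i]
                messages += 1
        rounds += 1
    return vals, messages, rounds
```

For n = 8 and + as integer addition, the simulation sends m = 2(n-1) = 14 messages in t = 2 log n = 6 rounds, and every processor ends up holding the combined value.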
Another algorithm creates multiple reduction trees in parallel on the n processors, each rooted at a different processor. This algorithm establishes communication between processors according to the structure of the butterfly graph. In this algorithm, a total of m = n log n messages are sent in t = log n communication rounds.
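The butterfly pattern can be sketched in the same style, assuming for simplicity that n is a power of two (the disclosure does not state this restriction; the function name combine_butterfly is likewise an assumption made here). In round k, processor p_i exchanges its partial result with the partner whose index differs in bit k.

```python
def combine_butterfly(data, op):
    """Simulate a butterfly (recursive-doubling) combine: in the round
    for distance k, p_i and its partner p_{i XOR k} exchange partial
    results. Every processor ends the log n rounds with the full
    combination. Illustrative helper, not from the disclosure."""
    n = len(data)
    assert n & (n - 1) == 0 and n > 0, "power-of-two n assumed"
    vals = list(data)
    messages = 0
    rounds = 0
    k = 1
    while k < n:
        # Snapshot so all exchanges in a round use pre-round values.
        prev = list(vals)
        for i in range(n):
            partner = i ^ k
            vals[i] = op(prev[i], prev[partner])
            messages += 1  # each processor sends one message per round
        rounds += 1
        k *= 2
    return vals, messages, rounds
```

For n = 8 this sends m = n log n = 24 messages in t = log n = 3 rounds, matching the counts above: fewer rounds than the tree algorithm, at the cost of more messages.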
In each of the two common algorithms for combining data among n processors, one of the parameters m and t is minimized at the expense of increasing...