A Heuristic Solution to the Data Flow Deadlock Across a Parallelized Sort-Merge With Skewed Input Key Values in a Non-Shared Memory Environment
Original Publication Date: 2005-Mar-23
Included in the Prior Art Database: 2005-Mar-23
A method for avoiding a data flow deadlock across a parallelized sort-merge operation in a non-shared memory environment uses a simple heuristic for sending dummy key values to consumers only when needed, rather than sending them in a periodic fashion with a potentially higher fixed overhead.
A Heuristic Solution to the Data Flow Deadlock Across a Parallelized Sort -Merge With Skewed Input Key Values in a Non -Shared Memory Environment
Disclosed is a method for preventing data flow deadlocks across parallelized sort-merge operations in a non-shared memory environment. Such deadlocks are described, and some solutions discussed in  and . Among the solutions discussed is the production of "dummy" key values that can be fed to otherwise starved merge inputs to break the deadlock. The suggested approach is to send such dummy keys periodically. However, this imposes a fixed overhead on the sort-merge and may send many unnecessary dummy key values. Another known solution produces dummy key values only when needed, but requires shared memory to do so, as the sort-merge participants examine each others' state in order to determine when to generate a dummy value.
The present invention works in a non-shared memory environment while avoiding the fixed overhead of sending periodic dummy values. This method uses a low-overhead heuristic method for determining when a sort operator should supply a dummy value to a client merge operator. The dummy key values are placed in the normal output stream of the sort at least as often as they are needed to prevent a deadlock, but are not generated constantly/periodically--only in response to a specific indication that such a value could be needed.
In summary, the present invention operates as follows:
a. For a parallelized sorted merge, if a sort is about to be blocked due to flow control while sending to one of its consumers (a merge), a dummy packet which contains the current sort key value being emitted by that sort, will be sent out to all other consumers (merges), i.e., each consumer other than the one with...