Browse Prior Art Database

A Heuristic Solution to the Data Flow Deadlock Across a Parallelized Sort-Merge With Skewed Input Key Values in a Non-Shared Memory Environment

IP.com Disclosure Number: IPCOM000109195D
Original Publication Date: 2005-Mar-23
Included in the Prior Art Database: 2005-Mar-23
Document File: 2 page(s) / 30K

Publishing Venue

IBM

Abstract

A method for avoiding a data flow deadlock across a parallelized sort-merge operation in a non-shared memory environment uses a simple heuristic for sending dummy key values to consumers only when needed, rather than sending them in a periodic fashion with a potentially higher fixed overhead.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.

Page 1 of 2

A Heuristic Solution to the Data Flow Deadlock Across a Parallelized Sort -Merge With Skewed Input Key Values in a Non -Shared Memory Environment

Disclosed is a method for preventing data flow deadlocks across parallelized sort-merge operations in a non-shared memory environment. Such deadlocks are described, and some solutions discussed in [1] and [2]. Among the solutions discussed is the production of "dummy" key values that can be fed to otherwise starved merge inputs to break the deadlock. The suggested approach is to send such dummy keys periodically. However, this imposes a fixed overhead on the sort-merge and may send many unnecessary dummy key values. Another known solution produces dummy key values only when needed, but requires shared memory to do so, as the sort-merge participants examine each others' state in order to determine when to generate a dummy value.

The present invention works in a non-shared memory environment while avoiding the fixed overhead of sending periodic dummy values. This method uses a low-overhead heuristic method for determining when a sort operator should supply a dummy value to a client merge operator. The dummy key values are placed in the normal output stream of the sort at least as often as they are needed to prevent a deadlock, but are not generated constantly/periodically--only in response to a specific indication that such a value could be needed.

In summary, the present invention operates as follows:

a. For a parallelized sorted merge, if a sort is about to be blocked due to flow control while sending to one of its consumers (a merge), a dummy packet which contains the current sort key value being emitted by that sort, will be sent out to all other consumers (merges), i.e., each consumer other than the one with...