Shared Memory-Based Protocol for Transaction Recovery in Distributed DB/DC Systems

IP.com Disclosure Number: IPCOM000039326D
Original Publication Date: 1987-May-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 3 page(s) / 62K

Publishing Venue

IBM

Related People

Iyer, BR: AUTHOR [+3]

Abstract

This article describes a transaction recovery protocol for a shared memory-based distributed DB/DC system, that is, one containing multiple DB and DC subsystems residing on different processors. Reliable shared memory is assumed to be the main mechanism of inter-system communication. It provides a shared queue facility that is used both to exchange messages between subsystems and to support recovery, so only minimal overhead is incurred during normal processing to provide recovery capability. The protocol manages the messages in the shared queues so that no transaction is executed twice or lost after a subsystem failure.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 52% of the total text.

Recovery Protocol

Shared queues (in one or more reliable shared memory modules) can combine communication between subsystems and the securing of transaction progress into a single mechanism. In the figure, three subsystems (SS1, SS2, and SS3) are shown, with shared queues SQ1 and SQ2. Shared queue SQ1 receives messages from SS1 and delivers messages to SS2.

In the general usage of a queue, a message is deleted from the queue at the moment it is dequeued. This is not appropriate here, because the shared queue is also intended to serve as a logging device. If, after dequeueing a message, SS2 fails before it is able to process the message and enqueue the processed message into SQ2, a transaction would be lost. Hence, a message must be retained in SQ1 (even after it is dequeued by SS2) until the processed message is enqueued in SQ2. If SS2 fails, all dequeued messages in SQ1 that do not appear in SQ2 can be replayed.

This poses another subtle problem. Consider the case where a message enqueued in SQ2 is quickly dequeued and processed by SS3 and enqueued in SS3's output queue.
Because message deletion is asynchronous, the corresponding transaction message may still appear in SQ1 when the message is deleted from SQ2. If SS2 then fails, the original message in SQ1 will be replayed to SS2, resulting in a double execution of the transaction. One way to prevent this is not to remove a message from SQ2 until its corresponding message is removed from SQ1. A more practical way, described below, sends an acknowledgement to SQ1 when the message has been enqueued at SQ2, indicating that this message should no longer be considered for retransmission from SQ1. Two acknowledgements are used to signal the events of enqueueing and dequeueing a message from a shared queue. A subsystem...
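
As an illustration only, the retention and acknowledgement rules described above can be sketched in Python. This is a minimal single-process model, not the protocol as published: the names SharedQueue, State, process_one, and the three state labels are hypothetical, and the sketch assumes that enqueueing into the output queue and acknowledging the input queue happen as one atomic step in the reliable shared memory. Only the acknowledgement sent when a message is enqueued downstream is modelled; the second acknowledgement mentioned in the text is left out because its details are not given in this abbreviated version.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    PENDING = auto()    # enqueued, not yet handed to the consumer
    IN_FLIGHT = auto()  # dequeued by the consumer but retained for recovery
    ACKED = auto()      # processed message is in the next queue; never replay


@dataclass
class SharedQueue:
    """Hypothetical stand-in for a shared queue held in reliable shared memory."""
    name: str
    entries: dict = field(default_factory=dict)  # txn_id -> [State, payload]

    def enqueue(self, txn_id, payload):
        self.entries[txn_id] = [State.PENDING, payload]

    def dequeue(self):
        """Hand the oldest pending message to the consumer without deleting it."""
        for txn_id, entry in self.entries.items():
            if entry[0] is State.PENDING:
                entry[0] = State.IN_FLIGHT
                return txn_id, entry[1]
        return None

    def acknowledge(self, txn_id):
        """Record that the processed message has been enqueued downstream."""
        self.entries[txn_id][0] = State.ACKED

    def purge_acked(self):
        """Deletion may run at any later time; ACKED entries are already safe."""
        for txn_id in [t for t, e in self.entries.items() if e[0] is State.ACKED]:
            del self.entries[txn_id]

    def recover(self):
        """After a consumer failure, make unacknowledged dequeued messages pending again."""
        replayed = []
        for txn_id, entry in self.entries.items():
            if entry[0] is State.IN_FLIGHT:
                entry[0] = State.PENDING
                replayed.append(txn_id)
        return replayed


def process_one(inq, outq, work, crash_before_enqueue=False):
    """One step of a subsystem such as SS2: dequeue, process, enqueue, acknowledge."""
    item = inq.dequeue()
    if item is None:
        return None
    txn_id, payload = item
    result = work(payload)
    if crash_before_enqueue:
        return None  # simulated failure: nothing reached outq, no acknowledgement
    # Assumption: these two updates are atomic in the reliable shared memory.
    outq.enqueue(txn_id, result)
    inq.acknowledge(txn_id)
    return txn_id


sq1, sq2, sq3 = SharedQueue("SQ1"), SharedQueue("SQ2"), SharedQueue("SQ3")
sq1.enqueue("T1", "debit 10")
sq1.enqueue("T2", "credit 5")

process_one(sq1, sq2, str.upper)                             # SS2 completes T1
process_one(sq2, sq3, str.lower)                             # SS3 quickly consumes T1
sq2.purge_acked()                                            # T1 leaves SQ2, still in SQ1
process_one(sq1, sq2, str.upper, crash_before_enqueue=True)  # SS2 fails holding T2

replayed = sq1.recover()  # only T2 is eligible, so T1 is not executed twice
```

In this model, recovery decides what to replay from the state recorded in SQ1 alone, rather than by comparing SQ1 with the current contents of SQ2. That is why the early deletion of T1 from SQ2 does not lead to a second execution.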