Method for Avoiding And Repairing Damage to Distributed Transactions in A Coordinated Resource Recovery System

IP.com Disclosure Number: IPCOM000119943D
Original Publication Date: 1991-Mar-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 5 page(s) / 200K

Publishing Venue

IBM

Related People

Ainsworth, MK: AUTHOR [+5]

Abstract

Described is a method for repairing damage in a distributed transaction environment that supports coordinated resource recovery. Coordinated resource recovery involves maintenance of transaction atomicity for all participants in the transaction.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 31% of the total text.

      Participants cooperate in a two-phase commit protocol for
synchronizing the transaction.  Transactions are committed when one
of the participants (A) initiates a syncpoint protocol with the other
participants known to that initiator (B and H).  The syncpoint is
then cascaded to further participants (B to C and I; C to D and E).
Terminal nodes in this syncpoint tree represent resource
participants.  Non-terminal nodes represent participants that use the
resources.
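
      The syncpoint tree in this example can be sketched in a few
lines of Python.  This is only an illustration, not the disclosed
implementation: the dictionary layout, the function names, and the
depth-first cascade order are assumptions made for the sketch.

```python
# Hypothetical encoding of the example tree: A initiates the syncpoint
# with B and H; B cascades it to C and I; C cascades it to D and E.
SYNCPOINT_TREE = {
    "A": ["B", "H"],
    "B": ["C", "I"],
    "C": ["D", "E"],
}


def children(node):
    """Return the offspring of a node (empty for terminal nodes)."""
    return SYNCPOINT_TREE.get(node, [])


def cascade_order(root):
    """List participants in the order a cascaded syncpoint reaches them.

    Depth-first order is an assumption of this sketch; the text does not
    specify the order in which offspring are contacted.
    """
    order = [root]
    for child in children(root):
        order.extend(cascade_order(child))
    return order


def classify(root):
    """Split participants into terminal (resource) and non-terminal nodes."""
    nodes = cascade_order(root)
    terminal = [n for n in nodes if not children(n)]
    non_terminal = [n for n in nodes if children(n)]
    return terminal, non_terminal


ORDER = cascade_order("A")
TERMINAL, NON_TERMINAL = classify("A")
print("cascade order:", ORDER)
print("resource participants:", TERMINAL)
print("resource users:", NON_TERMINAL)
```

      Under these assumptions D, E, I and H come out as the resource
participants, and A, B and C as the participants that use them.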

      Response to a syncpoint request is either commit (successful
transaction) or back-out (unsuccessful transaction).  Each
participating process is responsible for logging syncpoint
activities.  In the case of a failure during the syncpoint operation,
the log may be used to recover from the syncpoint failure.  Recovery
Monitors (B', C') are employed to maintain the syncpoint logs (L1,
L2) and manage the automatic recovery from failing syncpoints.  The
participating processes (B, C) use their local Recovery Monitors
(B', C') to log the portion of the syncpoint activity that is
known to them.  Each log records the state of the process's parent in
the syncpoint tree (A for B; B for C) and of its offspring (C and I
for B; D and E for C), but of no other participants.
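
      The limited scope of each syncpoint log can be illustrated as
follows.  This is a minimal sketch under stated assumptions: the
pairing of L1 with B' and L2 with C', the "unknown" placeholder state,
and all function names are hypothetical and are not taken from the
disclosure.

```python
# Same hypothetical tree as in the example: A -> B, H; B -> C, I; C -> D, E.
SYNCPOINT_TREE = {"A": ["B", "H"], "B": ["C", "I"], "C": ["D", "E"]}


def parent_of(node):
    """Return the parent of a node in the syncpoint tree, or None for the root."""
    for parent, kids in SYNCPOINT_TREE.items():
        if node in kids:
            return parent
    return None


def new_log(node):
    """Create a log that tracks only the node's parent and its offspring."""
    neighbours = [parent_of(node)] + SYNCPOINT_TREE.get(node, [])
    # "unknown" is a placeholder state invented for this sketch.
    return {n: "unknown" for n in neighbours if n is not None}


def record(log, participant, state):
    """Record a participant's state; reject participants outside the log's scope."""
    if participant not in log:
        raise KeyError(f"{participant} is outside this log's scope")
    log[participant] = state


L1 = new_log("B")  # assumed to be the log kept by Recovery Monitor B'
L2 = new_log("C")  # assumed to be the log kept by Recovery Monitor C'
record(L1, "C", "commit")
print(sorted(L1), sorted(L2))
```

      In this sketch the log for B holds entries for A, C and I only,
so an attempt to record state for D or E there is rejected.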

      Damage to the transaction can occur when there is a failure
during syncpoint and automatic recovery is unable to bring all
involved participants and resources to a consistent state (all
committed or all backed out).  This may result when processors or
communications between syncpoint participants fail.  Two kinds of
damage may occur:
      1.   Incomplete or delayed automatic recovery.
           In this case damage is avoided by allowing an operator to
provide surrogate responses on behalf of the unavailable participant
or resource.
      2.   State inconsistencies between participants in the
syncpoint.
           In this case a Recovery Monitor or participating resource,
lacking direction from its parent in the syncpoint tree, chooses an
action (commit or back-out) that is inconsistent with the state of
the other participants in the failed syncpoint.  Here, damage
repair involves changing the syncpoint state of the inconsistent
resources and participants to agree with the remaining resources and
participants.  Although the described method does nothing to repair
damage to a resource, it does support adjusting the syncpoint log
records to make them consistent, which permits the syncpoint to
complete normally.
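
      The two remedies described above, surrogate responses and
adjustment of inconsistent log records, might look roughly like the
following sketch.  It is not the disclosed method: the log layout, the
function names, and the idea that the operator names the outcome to
converge on are all assumptions made for illustration.  As in the
text, only log records are changed; the resources themselves are not
repaired.

```python
COMMIT, BACK_OUT = "commit", "back-out"


def surrogate_response(log, participant, response):
    """Operator answers on behalf of an unavailable participant (case 1)."""
    if response not in (COMMIT, BACK_OUT):
        raise ValueError(f"unsupported response: {response!r}")
    log[participant] = response


def find_inconsistent(log, outcome):
    """Return participants whose logged state differs from the given outcome."""
    return sorted(p for p, state in log.items() if state != outcome)


def repair(log, outcome):
    """Rewrite inconsistent log records to agree with the outcome (case 2).

    Returns the participants whose records were changed.
    """
    changed = find_inconsistent(log, outcome)
    for participant in changed:
        log[participant] = outcome
    return changed


# Case 1: I is unavailable, so the operator supplies its response.
log_b = {"A": COMMIT, "C": COMMIT, "I": None}
surrogate_response(log_b, "I", COMMIT)

# Case 2: E backed out on its own while the other participants committed.
log_c = {"B": COMMIT, "D": COMMIT, "E": BACK_OUT}
REPAIRED = repair(log_c, COMMIT)
print(log_b, log_c, REPAIRED)
```

      Once every record in a log agrees, nothing in the log prevents
the syncpoint from completing normally.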

      The opera...