Recovery of Data Sharing Systems in the Absence of a Synchronized Clock

IP.com Disclosure Number: IPCOM000104991D
Original Publication Date: 1993-Jun-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 4 page(s) / 193K

Publishing Venue

IBM

Related People

Dan, A: AUTHOR [+3]

Abstract

Disclosed is a recovery scheme for a shared-disk system in which each node writes its own log. Logs from the various nodes can be merged for node recovery without requiring synchronized clocks, and a global lock manager provides faster recovery.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 28% of the total text.

Recovery of Data Sharing Systems in the Absence of a Synchronized Clock

      Disclosed is a recovery scheme for a shared-disk system in which
each node writes its own log.  Logs from the various nodes can be
merged for node recovery without requiring synchronized clocks, and a
global lock manager provides faster recovery.

      In a shared-disk system where a page is not necessarily flushed
to the disk before its control (write access) is passed from one node
to the other, several updates by different nodes on the same page may
be pending (not propagated to the disk) and the node holding the
latest copy bears the propagation responsibility.  In case of a
failure of the node holding the propagation responsibility,  all the
pending updates on that page by different nodes need to be replayed
in the right order.  Therefore, the log entries of all the nodes
describing updates to that page need to be totally ordered by
merging the various log streams from the nodes during recovery.
If the systems have synchronized clocks and the LSNs (Log Sequence
Numbers) of the log records reflect the system clock, this is
relatively simple.  However, a scheme is described wherein the clocks
need not be synchronized, yet the log merge is trivial.
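
      The synchronized-clock case mentioned above can be illustrated
with a minimal sketch.  It assumes that each node's log is already
sorted by LSN and that LSNs are comparable across nodes because they
reflect a common clock.  The record layout (LogRecord) and the helper
merge_logs_for_page are hypothetical names chosen for illustration;
they do not appear in the disclosure.

```python
import heapq
from typing import NamedTuple


class LogRecord(NamedTuple):
    # Hypothetical record layout, for illustration only.
    lsn: int      # assumed to reflect a synchronized system clock
    node: str     # node that wrote the record
    page: int     # page the update applies to
    update: str   # opaque description of the change


def merge_logs_for_page(logs, page):
    """K-way merge of per-node logs (each sorted by LSN), keeping one page."""
    # The node name breaks ties so the resulting order is deterministic.
    merged = heapq.merge(*logs, key=lambda r: (r.lsn, r.node))
    return [r for r in merged if r.page == page]


log_a = [LogRecord(10, "A", 7, "x=1"), LogRecord(40, "A", 7, "x=4")]
log_b = [LogRecord(20, "B", 7, "x=2"), LogRecord(30, "B", 9, "y=1")]

ordered = merge_logs_for_page([log_a, log_b], page=7)
for rec in ordered:
    print(rec.lsn, rec.node, rec.update)
```

      The merge is only correct if the LSNs of different nodes are
mutually consistent, which is exactly the property that unsynchronized
clocks fail to guarantee.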

      Furthermore, standard database recovery schemes such as ARIES
[4] use the pending update state as captured at the last checkpoint
and make pessimistic assumptions about the state from that time until
the crash.  An appropriate global lock manager is shown to improve
upon the recovery times by providing a snapshot of the pending update
state at the time of the node failure.
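
      The idea of a lock manager that holds a snapshot of the pending
update state can be sketched as follows.  This is an assumption-laden
illustration, not the mechanism of the disclosure (whose details are
not part of this excerpt): the class GlobalLockManager, its methods,
and the bookkeeping it keeps are all hypothetical.

```python
class GlobalLockManager:
    """Hypothetical lock manager that tracks pending (unflushed) updates."""

    def __init__(self):
        self.holder = {}    # page id -> node holding the latest copy
        self.pending = {}   # page id -> nodes with unflushed updates, in order

    def grant_write(self, page_id, node):
        # Assumption: every write grant is followed by an update to the page.
        self.holder[page_id] = node
        self.pending.setdefault(page_id, []).append(node)

    def page_flushed(self, page_id):
        # Once the page reaches the disk, no update on it is pending.
        self.pending.pop(page_id, None)

    def pages_to_recover(self, failed_node):
        """Pages whose latest copy was lost with the failed node."""
        return {
            page_id: list(nodes)
            for page_id, nodes in self.pending.items()
            if self.holder.get(page_id) == failed_node
        }


glm = GlobalLockManager()
glm.grant_write(7, "A")
glm.grant_write(7, "B")   # control of page 7 passes to B without a flush
glm.grant_write(9, "B")
glm.page_flushed(9)       # page 9 no longer has pending updates
glm.grant_write(3, "A")

print(glm.pages_to_recover("B"))
```

      Under these assumptions, recovery after a failure of node B
needs to consider only page 7, together with the logs of nodes A and
B, rather than every page that might have been updated since the last
checkpoint.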

      There are two orthogonal facets to this invention, and each of
these will be described separately.  The first facet provides a way
to impose a logical order on the log entries of various nodes
describing updates to a particular page, and the second facet provides
ways to improve on the first:

o     Using a scheme analogous to Lamport's clock scheme [2] to
    synchronize the merging of logs
o     Using a global lock manager to recover the database faster
    than standard ARIES recovery.
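
      A Lamport-style logical clock applied to LSNs might look like
the sketch below.  It assumes that the page carries the LSN of its
last update when control passes between nodes and that a node
advances its own counter past that value before logging.  The excerpt
does not give the disclosure's exact rule, so the classes Page and
Node and the function merge_for_page are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Page:
    page_id: int
    page_lsn: int = 0   # LSN of the last update applied to this copy


@dataclass
class Node:
    name: str
    local_lsn: int = 0  # unsynchronized, per-node logical counter
    log: list = field(default_factory=list)

    def update(self, page, change):
        # Lamport rule: move past both the local counter and the page's LSN.
        self.local_lsn = max(self.local_lsn, page.page_lsn) + 1
        page.page_lsn = self.local_lsn
        self.log.append((self.local_lsn, self.name, page.page_id, change))


def merge_for_page(nodes, page_id):
    """Order one page's records from all node logs by LSN."""
    records = [r for n in nodes for r in n.log if r[2] == page_id]
    return sorted(records)


a = Node("A")
b = Node("B", local_lsn=50)   # B's counter is far ahead of A's
page = Page(7)

b.update(page, "x=1")   # logged with LSN 51
a.update(page, "x=2")   # A jumps from 0 to 52
b.update(page, "x=3")   # logged with LSN 53

for rec in merge_for_page([a, b], 7):
    print(rec)
```

      Because each update to a page receives an LSN larger than that
of the preceding update to the same page, sorting by LSN recovers the
per-page update order even though the two counters were never
synchronized.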

      In [5], a method for log synchronization is described that
enables log merges during the undo phase.  The architecture assumes
that whenever the ownership of a page is transferred, the page is
also flushed to the disk.  Consequently, for redo processing, only the
log from the crashed node needs to be replayed, and hence no log
merge is required.  However, in the presence of long-running
transactions, the log records describing the undos on a page might be
present on several systems, and need to be applied in the right
order.  For this, each page carries with it a logical update sequence
number (USN), which is incremented every time a log record for that
page is written.  On a per-page basis, all log records for that page have a
global ordering across all the nodes, bas...