Scheme for High Data Availability from a Shared-Cache in the Event of Hardware/Software Failures in a Multi-system Shared Disk Environment
Original Publication Date: 1994-Dec-01
Included in the Prior Art Database: 2005-Mar-28
Bhargava, G: AUTHOR [+3]
This invention presents a scheme for providing high data-availability from a shared-cache in the presence of system/cast out failures in a multi-system, shared-disk, transaction processing environment.
A shared electronic cache can be used for propagating updates
between database management systems (DBMSs) in a multi-system data
sharing complex with shared disks. Presented in this invention is a
scheme for isolating pages of the shared cache that are impacted by
temporary DBMS or cast out failures. Schemes for quick recovery of
such pages are also described.
The "timestamp advancing mechanism" can be summarized as follows:
o Each DBMS maintains a local timestamp which is earlier than the
timestamp of any page that has not been externalized. This
timestamp is referred to as the sys-timestamp.
o When a DBMS updates a page, it writes a log record and
subsequently writes the page to the shared cache as a changed
page. With the write operation it provides its sys-timestamp.
o The cast out process computes the min(sys-timestamp) across the
systems and updates the timestamp in the cache directory entry
for the page after it has written the page to disk.
o Periodically, by computing the minimum timestamp across a) the
sys-timestamps of all systems, read before scanning the cache, and
b) the pages in the cache, it is possible to establish a position
in the time-sequenced merged log(s) so that all updates to cached
pages not already reflected in their respective disk versions can
be captured through the log records. Consequently, these pages can
be recovered from disk in case the cache fails. This timestamp
is called the recover-timestamp.
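
The steps above can be sketched in a few lines of Python. This is only an illustrative model under stated assumptions, not the disclosed implementation: the names CacheEntry, SharedCache, write_page, cast_out and recover_timestamp are hypothetical, timestamps are plain integers, and log writing and disk I/O are left out.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    """Directory entry for one cached page (hypothetical layout)."""
    timestamp: int        # sys-timestamp given with the write, later the cast out minimum
    changed: bool = True  # True until the page has been written to disk


@dataclass
class SharedCache:
    directory: dict = field(default_factory=dict)  # page id -> CacheEntry

    def write_page(self, page_id, sys_timestamp):
        # A DBMS writes a changed page and supplies its sys-timestamp.
        self.directory[page_id] = CacheEntry(sys_timestamp, changed=True)

    def cast_out(self, page_id, sys_timestamps):
        # Called after the page has reached disk (the disk write itself is
        # omitted here): stamp the entry with min(sys-timestamp).
        entry = self.directory[page_id]
        entry.timestamp = min(sys_timestamps.values())
        entry.changed = False


def recover_timestamp(cache, sys_timestamps):
    # Read the sys-timestamps first, then scan the cache, in the order
    # given in the text, and take the overall minimum.
    floor = min(sys_timestamps.values())
    page_floor = min((e.timestamp for e in cache.directory.values()), default=floor)
    return min(floor, page_floor)
```

In this model, a page written with sys-timestamp 90 holds the recover-timestamp at 90 until it is cast out; after cast out its entry carries the current minimum sys-timestamp, and the recover-timestamp can advance to that value.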
During mainline operations of the DBMSs, the cast out process reads the
sys-timestamps periodically and computes their minimum. Even if a
system in the data-sharing complex is down, its sys-timestamp (which
is not progressing) will be taken into account. This prevents the
recover-timestamp from moving forward. For data availability reasons,
this may be unacceptable; this invention presents a solution that
remedies the situation.
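
The stall described above can be illustrated with a small simulation. It is a sketch under assumptions: the function observe_minimum, the system names and the fixed step of 10 are all hypothetical, and the code only shows the problem, not the remedy proposed by the invention.

```python
def observe_minimum(sys_timestamps, down, rounds, step=10):
    """Return the min(sys-timestamp) seen by the cast out process each round.

    Systems named in `down` never advance their sys-timestamp; all other
    systems advance by `step` per round (an arbitrary illustrative rate).
    """
    current = dict(sys_timestamps)  # work on a copy, leave the input untouched
    observed = []
    for _ in range(rounds):
        for name in current:
            if name not in down:
                current[name] += step
        observed.append(min(current.values()))
    return observed
```

For three systems that all start at 100, `observe_minimum(..., down={"DBMS3"}, rounds=3)` returns `[100, 100, 100]`, whereas with no system down it returns `[110, 120, 130]`. A single stopped system is enough to pin the minimum.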
A similar problem occurs when a page cannot be cast out to disk
because, for example, disk connectivity is temporarily lost. The
recover-timestamp does not move forward because the directory entry
in the cache for such a page still reflects the old timestamp value.
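
A minimal sketch of the cast out failure case follows. The directory is modeled as a plain mapping from page id to timestamp, and DiskUnavailable, cast_out and failing_disk are hypothetical names introduced only for illustration.

```python
class DiskUnavailable(Exception):
    """Raised by the (hypothetical) disk writer when connectivity is lost."""


def cast_out(directory, page_id, min_sys_timestamp, write_to_disk):
    """Try to cast out one page; return True on success.

    The directory entry is updated only after the disk write succeeds,
    so a failed cast out leaves the old timestamp in place.
    """
    try:
        write_to_disk(page_id)
    except DiskUnavailable:
        return False
    directory[page_id] = min_sys_timestamp
    return True


def failing_disk(page_id):
    # Stand-in for a disk whose connectivity is temporarily lost.
    raise DiskUnavailable(page_id)
```

With `directory = {"p1": 90, "p2": 95}` and a current minimum sys-timestamp of 130, a successful cast out of p2 raises its entry to 130, while a failed cast out of p1 leaves it at 90. The minimum over the directory therefore stays at 90.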
For isolating failed pages, a page error list is
used. The purpose of the error list is to track pages for which the
DBMS may disallow read/write access and to track the starting and
ending timestamps used for the recovery of such pages. T...
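
One possible shape for such a page error list is sketched below. The surviving text does not specify its layout, so everything here is an assumption: the names ErrorEntry and PageErrorList, the use of integer timestamps, an optional ending timestamp, and the rule that any listed page is denied access (the text says only that the DBMS may disallow it).

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass
class ErrorEntry:
    start_timestamp: int                 # where recovery of the page would start
    end_timestamp: Optional[int] = None  # where it would end, if already known


class PageErrorList:
    """Tracks pages isolated after a DBMS or cast out failure (hypothetical)."""

    def __init__(self) -> None:
        self._entries: Dict[str, ErrorEntry] = {}

    def add(self, page_id: str, start_timestamp: int,
            end_timestamp: Optional[int] = None) -> None:
        self._entries[page_id] = ErrorEntry(start_timestamp, end_timestamp)

    def access_allowed(self, page_id: str) -> bool:
        # Assumption: every listed page is blocked for both reads and writes.
        return page_id not in self._entries

    def recovery_range(self, page_id: str) -> Tuple[int, Optional[int]]:
        entry = self._entries[page_id]
        return (entry.start_timestamp, entry.end_timestamp)

    def remove(self, page_id: str) -> None:
        # Called once the page has been recovered; unknown ids are ignored.
        self._entries.pop(page_id, None)
```

Keeping the timestamps next to the page id means a recovery routine could look up the range to process for a page without consulting any other structure, while unlisted pages remain fully accessible.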