Database Buffer Management for High Availability

IP.com Disclosure Number: IPCOM000104392D
Original Publication Date: 1993-Apr-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 4 page(s) / 222K

Publishing Venue

IBM

Related People

Copeland, GP: AUTHOR

Abstract

This disclosure presents a method for increasing availability in the Database Manager (DBM). The DBM uses a buffer to reduce disk I/O. A larger buffer further reduces disk I/O for both reads and writes during normal operation, i.e., when the system is operating properly. This allows increasingly cheap memory to be exploited for better performance. However, a larger buffer also increases recovery time after a system crash, i.e., when the contents of memory are lost or potentially corrupted. Therefore, in the current DBM, there is an increasingly difficult tradeoff between efficient normal operation and high availability. As the cost of memory decreases and the need for both performance and availability increases, this tradeoff becomes more undesirable.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 27% of the total text.

      This disclosure presents a method for increasing availability
in the Database Manager (DBM).  The DBM uses a buffer to reduce disk
I/O.  A larger buffer further reduces disk I/O for both reads and
writes during normal operation, i.e., when the system is operating
properly.  This allows increasingly cheap memory to be exploited for
better performance.  However, a larger buffer also increases recovery
time after a system crash, i.e., when the contents of memory are lost
or potentially corrupted.  Therefore, in the current DBM, there is an
increasingly difficult tradeoff between efficient normal operation
and high availability.  As the cost of memory decreases and the need
for both performance and availability increases, this tradeoff
becomes more undesirable.

      This disclosure significantly softens this tradeoff by allowing
the write buffer to be a configurable fraction of the read buffer, so
that the tradeoff is only between write buffering and availability.
The two benefits of this are that:

1.  a large read buffer no longer requires sacrificing any
    availability, and
2.  fewer pages are typically needed for write buffering than for
    read buffering because more pages are typically read than
    written.
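
      As a rough illustration of the idea above, the following Python
sketch caps the number of dirty pages at a configurable fraction of
the whole buffer.  It is not the disclosed mechanism: the class name
BufferPool, the parameter write_fraction, and the choice to flush the
least recently used dirty page when the cap is reached are all
assumptions made for this example.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool: dirty pages are capped at a fraction of its size."""

    def __init__(self, capacity, write_fraction):
        self.capacity = capacity
        # Hypothetical knob: the most pages that may be dirty at once.
        self.max_dirty = max(1, int(capacity * write_fraction))
        self.pages = OrderedDict()  # page id -> is_dirty, in LRU order
        self.flushed = []           # page ids written to disk, for inspection

    def _flush(self, pid):
        self.flushed.append(pid)    # stands in for a real disk write
        self.pages[pid] = False     # page stays buffered, now clean

    def _make_room(self):
        # Evict least recently used pages, flushing them first if dirty.
        while len(self.pages) >= self.capacity:
            pid, dirty = next(iter(self.pages.items()))
            if dirty:
                self._flush(pid)
            del self.pages[pid]

    def read(self, pid):
        if pid not in self.pages:
            self._make_room()
            self.pages[pid] = False
        self.pages.move_to_end(pid)

    def write(self, pid):
        self.read(pid)
        if not self.pages[pid]:
            dirty = [p for p, d in self.pages.items() if d]
            if len(dirty) >= self.max_dirty:
                self._flush(dirty[0])  # least recently used dirty page
            self.pages[pid] = True


pool = BufferPool(capacity=8, write_fraction=0.25)  # at most 2 dirty pages
for pid in (1, 2, 3):
    pool.write(pid)
print(pool.flushed)  # [1]
```

      In this sketch the read side can still use all 8 frames, while
only the dirty pages, here at most 2, would need attention after a
crash.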

BACKGROUND - The DBM uses a buffer to reduce both read and write disk
I/O.  Read buffering is relatively simple and is used in most
operating systems.  Write buffering is considerably more complex.  It
requires that updated pages not be forced to disk at the end of a
recoverable and durable transaction.  Instead, only a log containing
the updates is written prior to transaction commit.  This is called
"write-ahead logging" (or "WAL") using a no-force buffer policy.  The
DBM uses the ARIES algorithm for recovery [*], for which several
patent applications have been filed.
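
      The no-force policy described above can be sketched in a few
lines of Python.  This is a minimal model under stated assumptions,
not the DBM's code: the classes Log and Database, their methods, and
the use of a record's position in the log as its lsn are hypothetical
names and simplifications chosen for illustration.

```python
class Log:
    def __init__(self):
        self.records = []     # all appended records, in lsn order
        self.stable_upto = 0  # records with lsn <= this are on disk

    def append(self, txn, page_id, value):
        self.records.append((txn, page_id, value))
        return len(self.records)  # assumption: lsn = 1-based log position

    def force(self):
        self.stable_upto = len(self.records)  # stands in for a log write


class Database:
    def __init__(self):
        self.log = Log()
        self.buffer = {}  # page id -> (value, pagelsn), lost in a crash
        self.disk = {}    # page id -> (value, pagelsn), survives a crash

    def update(self, txn, page_id, value):
        lsn = self.log.append(txn, page_id, value)
        self.buffer[page_id] = (value, lsn)

    def commit(self, txn):
        # No-force: only the log is made stable; updated pages stay buffered.
        self.log.append(txn, None, "COMMIT")
        self.log.force()

    def flush_page(self, page_id):
        value, lsn = self.buffer[page_id]
        # Write-ahead rule: the log must be stable up to the page's lsn
        # before the page itself is written.
        if lsn > self.log.stable_upto:
            self.log.force()
        self.disk[page_id] = (value, lsn)


db = Database()
db.update("t1", "p1", "a")
db.commit("t1")
print(db.disk)  # {}
```

      After the commit the disk copy of the page is still untouched;
durability rests entirely on the forced log.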

      A system crash is defined as the loss of memory.  It can be due
to a power outage, a software malfunction in the DBM or a software
malfunction in the operating system.  Pages are tagged with an lsn
(log sequence number), which indicates the log position when the page
was last updated and always increases over time.  A page in the
buffer always has an lsn that is greater than or equal to that of its
version on disk.  If the buffer and disk versions of the page have the
same lsn, then the buffer page is called "clean".  If the buffer page
lsn is greater than that of its version on disk, then the buffer page
is called "dirty".  In addition to the lsn on each page (its
"pagelsn"), the buffer manager also remembers the lsn at which each
buffer page first became dirty (its "pminlsn"; a.k.a. "reclsn" in
ARIES).  Fig. 1 illustrates an example of the state of pages during
normal processing.  Of the 8 total pages, 4 pages are in the buffer
(p1, p3, p4 and p6).  Of those 4 pages in the buffer, 2 pages are
clean (p4 and p6) and 2 pages are dirty (p1 and p3).
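
      The clean/dirty rule can be checked with a small Python model of
the page state just described.  The page names and the clean/dirty
split follow the text; the lsn values themselves are invented for
illustration, since the figure is not reproduced here, and the
BufferPage class and the convention of leaving pminlsn unset for a
clean page are assumptions of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BufferPage:
    pagelsn: int                   # lsn of the last update to the page
    pminlsn: Optional[int] = None  # lsn when first dirtied; None if clean


def is_dirty(page, disk_lsn):
    # Dirty means the buffer copy is newer than the disk copy.
    return page.pagelsn > disk_lsn


# Hypothetical lsn of each of the 8 pages as stored on disk.
disk_lsn = {"p1": 10, "p2": 12, "p3": 15, "p4": 20,
            "p5": 22, "p6": 25, "p7": 27, "p8": 30}

# The 4 pages currently in the buffer.
buffer = {
    "p1": BufferPage(pagelsn=40, pminlsn=35),
    "p3": BufferPage(pagelsn=45, pminlsn=45),
    "p4": BufferPage(pagelsn=20),
    "p6": BufferPage(pagelsn=25),
}

dirty = sorted(p for p, page in buffer.items() if is_dirty(page, disk_lsn[p]))
clean = sorted(p for p in buffer if p not in dirty)
print(dirty, clean)  # ['p1', 'p3'] ['p4', 'p6']
```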

      Using WAL, rec...