Browse Prior Art Database

Eliminating Two-Phase Commits for Function Shipping with a Log Server

IP.com Disclosure Number: IPCOM000111086D
Original Publication Date: 1994-Feb-01
Included in the Prior Art Database: 2005-Mar-26
Document File: 2 page(s) / 147K

Publishing Venue

IBM

Related People

Bhide, A: AUTHOR [+3]

Abstract

Disclosed is a method to eliminate two phase commits for shared-nothing database machines with a log server. The basic idea is to return log information from participating nodes to a coordinator which initiates logging, resulting in significantly reduced communciations overhead.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 38% of the total text.

Eliminating Two-Phase Commits for Function Shipping with a Log Server

Disclosed is a method to eliminate two phase commits for
shared-nothing database machines with a log server.  The basic idea
is to return log information from participating nodes to a
coordinator which initiates logging, resulting in significantly
reduced communciations overhead.

      In a shared nothing database system, the database is
partitioned among the nodes; data at remote nodes is accessed by
making remote database calls to the partition owning the data.  If a
transaction updates data at n remote nodes, all n remote nodes are
involved in two phase commit processing.  This involves 3 times n
messages between nodes (n acknowledges to the commit/abort messages
can be batched, and are therefore excluded).  A log server may be
used to amortize log disks over a number of nodes, but 3 times (n+1)
additional messages are needed for communication from all the nodes
to the log server, for a total of 6 times n + 3  messages.  For
example, for 10 remote nodes in a transaction, and a communications
pathlength of 5K instructions for a send or receive message, the
communication costs for commit processing are about 630K instructions
with additional overhead for prepare and force overheads at each
node.  This overhead can often be larger than the basic transaction
pathlength.  Thus, the objective of this invention is to devise a
method for reducing this overhead.

      A transaction runs at one node designated as the coordinator,
and makes remote database (e.g., SQL) requests to nodes at which it
needs to access data.  The basic idea is to return log information
from remote nodes to the coordinator, along with the database call
results.  At commit time, the coordinator sends a single log message,
including all the log records associated with the nodes accessed, to
the log server, which maintains different logical log files for each
partition.  The log server extracts the individual log records and
appends them to the appropriate log files.  The log servers sends an
acknowledge message to the coordinator, which returns the committed
results to the user, and asynchronously sends messages to the
participant nodes to release the locks, and perform any needed
clean-up operations.  Only a single phase commit to the log server is
needed (the prepare phase is eliminated altogether).  The number of
messages in the commit process is reduced to 2  (plus n asynchronous
messages which can be batched).  This compares with the  6 times n +
3  messages previously.

      The log server demultiplexes log records received from the
coordinator into separate logical log files; one for each partition.
Two possible ways that the log server can be organized are: (1) each
logical log file is written in a separate physical log file (2) all
the logical log files are mapped to a single physical log file.  In
case (1), the original scheme requires
 2 times (n + 1) I/Os (for...