Software Synchronization of Independent Timers in MP Clusters using Shared Memory

IP.com Disclosure Number: IPCOM000104871D
Original Publication Date: 1993-Jun-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 2 page(s) / 105K

Publishing Venue

IBM

Related People

Lehr, TF: AUTHOR

Abstract

Disclosed is a software mechanism for synchronizing free-running timers on the nodes of a cluster MP. A four-way MP has been architected and built whose nodes are RS/6000 chip sets. The nodes share memory using an atomic complex for synchronization. Each node runs a complete AIX* 3.2 operating system with kernel extensions to manage communication via the atomic complex.

This is the abbreviated version, containing approximately 52% of the total text.

      Disclosed is a software mechanism for synchronizing
free-running timers on the nodes of a cluster MP.  A four-way MP has
been architected and built whose nodes are RS/6000 chip sets.  The
nodes share memory using an atomic complex for synchronization.  Each
node runs a complete AIX* 3.2 operating system with kernel extensions
to manage communication via the atomic complex.

      The mechanism solves a problem that arises when system
functions rely on timestamped, fine-grained data on clusters of
communicating, cooperating computer nodes, each running an
autonomous, complete (AIX) kernel.  The problem affects both system
bookkeeping functions and measurements of the system's performance.
Two examples illustrate it.  In the first, assume that a database is
loaded onto the file system of a cluster and is accessible from each
of the nodes.  Assume also that the time shown by the free-running
timer on each node at any particular moment differs from that shown
on the other nodes.  If database queries seek the most recent data,
conflicts arise whenever a more recent datum carries a timestamp
older than that of a less recent datum.
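
      The database conflict can be made concrete with a small
sketch.  Nothing below comes from the disclosure itself: the record
layout, the function names, and the idea of modeling each node's
timer as true time plus a fixed skew are assumptions made purely for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: a value plus the reading of the writing
 * node's free-running timer at the moment of the write. */
struct record {
    int     value;
    int64_t stamp_ns;
};

/* Model a write at true time true_ns on a node whose timer is
 * offset from true time by skew_ns (an illustrative assumption). */
static struct record write_record(int value, int64_t true_ns,
                                  int64_t skew_ns)
{
    struct record r = { value, true_ns + skew_ns };
    return r;
}

/* A "most recent" query: pick the record with the larger stamp.
 * Ties go to the first argument. */
static const struct record *most_recent(const struct record *a,
                                        const struct record *b)
{
    assert(a != NULL && b != NULL);
    return (b->stamp_ns > a->stamp_ns) ? b : a;
}
```

      With a node whose timer runs 500 ns ahead, a write at true time
1000 is stamped 1500 and so beats a later write at true time 1200
from a node with no skew; the query returns the stale record.  With
equal skews on both nodes the later write wins, as intended.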

      The second example concerns performance tracing of the cluster,
that is, event-driven tracing of the kind done by the AIX 3.2 trace
facility [*].  When tracing communication between two or more nodes
of a cluster, one must know the times at which data is transmitted
and received as accurately as possible in order to analyze
communication performance credibly.  Although a trace on a single
RS/6000 running AIX reports event times with nanosecond precision,
comparing the traces of individual nodes is meaningless if the
free-running timers are not synchronized: transmissions on one node
cannot be matched with the corresponding receptions on another.
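
      The tracing problem reduces to simple arithmetic, sketched here
under stated assumptions.  The function name and the model of each
timer as true time plus a fixed skew are hypothetical and are not
taken from the AIX trace facility or from the disclosure.

```c
#include <assert.h>
#include <stdint.h>

/* Latency as it would appear when the send event is stamped by the
 * sender's timer and the receive event by the receiver's timer.
 * Each timer is modeled as true time plus a per-node skew. */
static int64_t apparent_latency_ns(int64_t send_true_ns,
                                   int64_t sender_skew_ns,
                                   int64_t recv_true_ns,
                                   int64_t receiver_skew_ns)
{
    /* A message cannot really arrive before it is sent. */
    assert(recv_true_ns >= send_true_ns);

    int64_t send_stamp = send_true_ns + sender_skew_ns;
    int64_t recv_stamp = recv_true_ns + receiver_skew_ns;
    return recv_stamp - send_stamp;
}
```

      For a message sent at true time 1000 and received at 1200,
synchronized timers give the real latency of 200 ns.  If the
receiver's timer lags by 500 ns, the traces show a latency of -300
ns, so the reception appears to precede the transmission.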

      By synchronizing the timers across the cluster, one may solve
the problems posed by these two examples and others.

      Before discussing the implementation of the synchronization
mechanism, a point must be made about the design paradigm.  The
design sought to leave the AIX 3.2 kernel as untouched as possible
and to confine all changes to kernel extensions.  Restricting the
work to kernel extensions was expected to make the technology more
appealing to development teams.

      The synchronization is accomplished using a special system call
implemented using kernel extensions.  The call takes one parameter,
an integer mask in which bit zero corresponds to node zero, bit one
to node one, etc.  If a mask bit is on, then a node issuing the system
call will attempt to synchronize its timer with the node represented
by the mask bit.  The system call causes the...
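
      The mask convention described above can be sketched as
follows.  This is an illustration only, not the actual kernel
extension: the helper names and the CLUSTER_NODES constant are
hypothetical, an unsigned type is used for the bit operations, and
bit zero is assumed to be the least significant bit, which the text
does not state.

```c
#include <assert.h>
#include <stddef.h>

#define CLUSTER_NODES 4u   /* the disclosure describes a four-way MP */

/* Mask bit for one node, assuming bit zero is the least
 * significant bit (an assumption, not stated in the text). */
static unsigned int node_bit(unsigned int node)
{
    assert(node < CLUSTER_NODES);
    return 1u << node;
}

/* List the nodes selected by mask, in ascending order.
 * Returns how many entries of targets[] were filled in.
 * Bits at or above CLUSTER_NODES are ignored. */
static size_t mask_to_nodes(unsigned int mask,
                            unsigned int targets[CLUSTER_NODES])
{
    size_t count = 0;
    for (unsigned int node = 0; node < CLUSTER_NODES; node++) {
        if (mask & (1u << node))
            targets[count++] = node;
    }
    return count;
}
```

      Under these assumptions, a caller that wants to synchronize
with nodes zero and two would pass node_bit(0) | node_bit(2), that
is, the value 5, and mask_to_nodes would report those two nodes.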