Shared Disk, UNIX-Based, Cluster File System

IP.com Disclosure Number: IPCOM000112873D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 88K

Publishing Venue

IBM

Related People

Devarakonda, M: AUTHOR [+2]

Abstract

Disclosed is a method of constructing a cluster file system that provides direct and concurrent access to file data stored on a disk from two or more processors of a cluster system. Data accesses are in the form of vnode and VFS operations which can be simultaneously issued from the cluster processors. The disclosed method includes serialization and data coherency which guarantee that the results of two vnode or VFS operations, say A and B, on the file data are such that either A is executed after B is complete or vice versa.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      The method is described here as a parallelization of the
Journaling File System (JFS) of AIX* for a cluster of RISC
System/6000** processors each running AIX.  The scheme consists of the
following components:

o   Serialization mechanism

o   Cluster-wide, distributed lock manager

o   Data coherency mechanism

o   Multi-node journal management for JFS

o   Handling of JFS file meta data ("special" files)

o   Shared disk access

      Each vnode or VFS operation of the JFS is intercepted and a
cluster-wide lock is obtained on files or parts of files involved in
the operation.  The lock may be a read or write lock depending on the
intended use.  If the necessary lock is not available in the
necessary mode, the requester is made to wait.  Once the needed locks
are obtained, the vnode or VFS operation will proceed as in the JFS.
When the operation is complete, all the locks obtained in the
beginning are released.  These lock and unlock operations constitute
the serialization mechanism, and they protect the integrity of the
file data even when the vnode and VFS operations are issued
simultaneously from multiple nodes.
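
      The intercept, lock, operate, unlock sequence can be sketched as
follows.  This is a minimal single-process illustration, not the
disclosed implementation: threads stand in for cluster nodes, and the
names RWLock, serialized and the "inode:7" style object names are
hypothetical.  Acquiring locks in sorted order is an assumption made
here to avoid deadlock; the disclosure does not state an ordering.

```python
import threading


class RWLock:
    """Stand-in for one cluster-wide lock; here it only spans threads."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire(self, mode):
        with self._cond:
            if mode == "read":
                # Readers wait only while a writer holds the lock.
                while self._writer:
                    self._cond.wait()
                self._readers += 1
            else:
                # A writer waits until nobody else holds the lock.
                while self._writer or self._readers:
                    self._cond.wait()
                self._writer = True

    def release(self, mode):
        with self._cond:
            if mode == "read":
                self._readers -= 1
            else:
                self._writer = False
            self._cond.notify_all()


_locks = {}
_locks_guard = threading.Lock()


def _lock_for(name):
    # Create the lock for an object on first use.
    with _locks_guard:
        return _locks.setdefault(name, RWLock())


def serialized(op, needs):
    """Run op() while holding every lock in needs: {name: "read" | "write"}."""
    held = []
    try:
        # Sorted order is an assumption of this sketch (deadlock avoidance).
        for name, mode in sorted(needs.items()):
            _lock_for(name).acquire(mode)
            held.append((name, mode))
        return op()  # the unmodified file system operation
    finally:
        # Release everything obtained at the beginning, even on error.
        for name, mode in reversed(held):
            _lock_for(name).release(mode)
```

      A caller would wrap each operation, for example
serialized(do_write, {"inode:7": "write"}), so that two conflicting
operations on the same object never overlap.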

      A cluster-wide, distributed lock manager provides the lock and
unlock operations on a file or a part of a file.  The lock manager
employs a token management scheme to take advantage of locality of
references in file access and thereby reduce inter-processor
communication.
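
      One way to picture how token caching exploits locality is the
sketch below.  It is an assumption-laden toy, not the disclosed lock
manager: TokenServer is a single hypothetical table rather than a
distributed one, tokens are exclusive only, and the messages counter
merely simulates inter-processor traffic.

```python
class TokenServer:
    """Hypothetical token table; a real manager would be distributed."""

    def __init__(self):
        self.holder = {}   # object name -> node currently holding its token
        self.messages = 0  # simulated inter-processor messages

    def request(self, node, obj):
        self.messages += 1  # request from the node to the token table
        current = self.holder.get(obj)
        if current is not None and current is not node:
            self.messages += 1  # revoke sent to the current holder
            current.revoke(obj)
        self.holder[obj] = node


class Node:
    def __init__(self, name, server):
        self.name = name
        self.server = server
        self.tokens = set()  # tokens cached on this node
        self.locked = set()  # objects locked by an in-progress operation

    def lock(self, obj):
        if obj not in self.tokens:
            # No cached token: this is the only path that sends messages.
            self.server.request(self, obj)
            self.tokens.add(obj)
        self.locked.add(obj)

    def unlock(self, obj):
        # The token stays cached, so a later lock on the same object is local.
        self.locked.discard(obj)

    def revoke(self, obj):
        # Sketch only: a real node would defer the revoke until unlock.
        assert obj not in self.locked, "revoke while locked is not modeled"
        self.tokens.discard(obj)
```

      In this model, repeated lock and unlock calls on the same file
from one node cost a single message in total; messages are sent again
only when a different node asks for the same file.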

      In addition to the serialization mechanism, a data coherency
scheme is needed because the base file system, JFS, extensively
caches file data in virtual memory.  The data coherency mechanism
assures that stale data is discarded from a processor's virtual
memory and changes made to the data in a processor's file cache are
propagated cluster-wide as appropriate.
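
      The two duties named above, discarding stale data and
propagating changes, can be illustrated with the sketch below.  The
protocol shown (write back dirty blocks and drop the cached copy when
a node gives up access) is an assumption chosen for illustration; the
disclosure does not specify when or how propagation happens.
SharedDisk, CachingNode and give_up are hypothetical names.

```python
class SharedDisk:
    """Stand-in for the disk that every node can access directly."""

    def __init__(self):
        self.blocks = {}  # block id -> bytes


class CachingNode:
    def __init__(self, disk):
        self.disk = disk
        self.cache = {}     # block id -> cached data
        self.dirty = set()  # block ids modified but not yet written back

    def read(self, blk):
        # Fill the cache from the shared disk on a miss.
        if blk not in self.cache:
            self.cache[blk] = self.disk.blocks.get(blk, b"")
        return self.cache[blk]

    def write(self, blk, data):
        # Changes land in the local cache first.
        self.cache[blk] = data
        self.dirty.add(blk)

    def give_up(self, blk):
        """Assumed hook: called before another node is granted access."""
        if blk in self.dirty:
            self.disk.blocks[blk] = self.cache[blk]  # propagate the change
            self.dirty.discard(blk)
        self.cache.pop(blk, None)  # drop a copy that could become stale
```

      With this hook tied to lock release or token revocation, a node
that later reads the block refills its cache from the shared disk and
sees the most recent write.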

      JFS employs a log, and the invention here provides a multi-node
log writing mechanism.  The log writ...