
A method to recover business-critical filesystems before others in a clustered filesystem
Disclosure Number: IPCOM000216082D
Publication Date: 2012-Mar-23
Document File: 2 page(s) / 45K

Publishing Venue

The Prior Art Database


Clustered filesystems are becoming increasingly popular, and several vendors are bringing their own versions to market. A clustered filesystem is highly available and provides reliable, fault-tolerant data access to the applications and services built on top of it. A single installation might manage hundreds of filesystems, of which only a subset is business critical. If the business priorities could be encoded in the clustered filesystem, availability could be improved.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 52% of the total text.


A method to recover business-critical filesystems before others in a clustered filesystem

A clustered filesystem (e.g., GPFS, OCFS, GFS) provides fault-tolerant, reliable data access to the applications and services built on top of it. One example is the DB2 pureScale architecture, which uses GPFS as its reliable shared data store.

    In the event of a failure, such a filesystem must run recovery operations to restore the integrity of its filesystems.

    However, when nodes or disks fail, recovery is performed by the remaining healthy members of the cluster. This recovery sometimes takes excessively long, and the SLAs of the services built on top of the filesystem are missed.

    The current art is to take all the filesystems that need recovery, distribute them among the surviving nodes, and recover them sequentially, one after another, in arbitrary order.

    However, this process does not take the opportunity to recover the business-critical filesystems before the others. One way to overcome this is to make the business priority of recovery a configurable setting of the underlying architecture. Once that is done, the cluster knows which filesystems are important to the applications built on top of it.

    Since the cluster now knows which filesystems are business critical, it can implement a priority recovery algorithm that recovers the most important filesystems first.

    In addition, higher-level dependent systems need not wait for all filesystems to be recovered; they can be informed as each individual filesystem of interest is recovered and react accordingly. This is important because several clustered filesystems use node fencing operations during recovery to preserve data integrity.
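The per-filesystem notification described above could look roughly like the following Python sketch. It is illustrative only: the disclosure does not specify a notification mechanism, so the class `RecoveryNotifier`, its methods, and the filesystem names are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# Hypothetical callback type: receives the name of the recovered filesystem.
Callback = Callable[[str], None]


class RecoveryNotifier:
    """Lets dependent services react as soon as one filesystem is recovered."""

    def __init__(self) -> None:
        # Maps a filesystem name to the callbacks of services that depend on it.
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, filesystem: str, callback: Callback) -> None:
        """Register interest in the recovery of a single filesystem."""
        self._subscribers[filesystem].append(callback)

    def filesystem_recovered(self, filesystem: str) -> int:
        """Notify only the subscribers of this filesystem; return how many were called."""
        callbacks = self._subscribers.get(filesystem, [])
        for callback in callbacks:
            callback(filesystem)
        return len(callbacks)


if __name__ == "__main__":
    notifier = RecoveryNotifier()
    notifier.subscribe("tablespaces", lambda fs: print(f"database may resume on {fs}"))
    notifier.filesystem_recovered("tablespaces")  # the database is told immediately
    notifier.filesystem_recovered("scratch")      # nobody is waiting on this one
```

With this shape, a service that depends only on a high-priority filesystem can resume while lower-priority filesystems are still being recovered.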

    The first step of a solution along these lines is to provide an interface (for example, an administration command) that captures the following:

a list of filesystems, each with a priority class: a number ranging from 1 to MAX, where smaller numbers mean lower priority and larger numbers mean higher priority.

    The second step is to implement a priority recovery algorithm that takes the priority class into account before scheduling filesystems for recovery. The details follow as pseudocode:

for priority-class in [ MAX : 1 ]
    int n = getSurvivingNodes()
    array fs = getFilesystemsInPriorityClass(priority-class)
    distributeWork(fs, n)

array fs = getNonPriorityFS()
int n = getSurvivingNodes()
distributeWork(fs, n)
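As a rough illustration, the pseudocode can be turned into a small runnable Python sketch. Everything named here is an assumption made for the example: `MAX_PRIORITY` stands in for MAX, `distribute_work` is a simple round-robin placeholder for distributeWork, and the filesystem and node names are invented. The pseudocode queries the surviving nodes again for every priority class; the sketch takes a fixed list of nodes for brevity.

```python
from typing import Dict, List

MAX_PRIORITY = 3  # hypothetical value for MAX, the highest priority class


def distribute_work(filesystems: List[str], nodes: List[str]) -> Dict[str, List[str]]:
    """Round-robin placeholder for distributeWork; a real cluster could balance by load."""
    assignment: Dict[str, List[str]] = {node: [] for node in nodes}
    for i, fs in enumerate(sorted(filesystems)):
        assignment[nodes[i % len(nodes)]].append(fs)
    return assignment


def plan_recovery(
    priorities: Dict[str, int],
    all_filesystems: List[str],
    surviving_nodes: List[str],
) -> List[Dict[str, List[str]]]:
    """Return one node assignment per batch: highest class first, unprioritized last."""
    if not surviving_nodes:
        raise ValueError("no surviving nodes to perform recovery")
    for fs, cls in priorities.items():
        if not 1 <= cls <= MAX_PRIORITY:
            raise ValueError(f"priority class of {fs!r} must be in 1..{MAX_PRIORITY}")

    batches: List[Dict[str, List[str]]] = []
    # Walk the priority classes from MAX down to 1, as in the pseudocode.
    for cls in range(MAX_PRIORITY, 0, -1):
        in_class = [fs for fs in all_filesystems if priorities.get(fs) == cls]
        if in_class:
            batches.append(distribute_work(in_class, surviving_nodes))
    # Filesystems with no configured priority are recovered last.
    rest = [fs for fs in all_filesystems if fs not in priorities]
    if rest:
        batches.append(distribute_work(rest, surviving_nodes))
    return batches


if __name__ == "__main__":
    plan = plan_recovery(
        priorities={"tablespaces": 3, "logs": 2},
        all_filesystems=["scratch", "tablespaces", "logs", "archive"],
        surviving_nodes=["nodeA", "nodeB"],
    )
    for step, batch in enumerate(plan, start=1):
        print(step, batch)
```

Each batch is meant to complete before the next one starts, so a filesystem in a higher class is never queued behind one in a lower class.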

    Each of the pseudo-functions can use the best algorithm currently available in the industry, so that overall the important filesystems are recovered before the less important ones. For example, recovering the tablespaces filesystem for a database would be more business critical than recoveri...