Minimise coupling facility (CF) accesses during CF recovery from backup and logs

IP.com Disclosure Number: IPCOM000013422D
Original Publication Date: 2002-Jun-16
Included in the Prior Art Database: 2003-Jun-18
Document File: 4 page(s) / 55K

Publishing Venue

IBM

Abstract

OVERVIEW

This is the abbreviated version, containing approximately 26% of the total text.

Disclosed is a technique to minimise the number of coupling facility accesses needed to restore the shared queue messages of a message broker, such as IBM's MQSeries* for OS/390* product. The messages are held in a coupling facility list structure, and are restored from a prior fuzzy backup of that list structure together with the changes to that list structure held in recovery logs. The technique exploits the fact that the recovery process has exclusive access to the list structure while it is being recovered, so the rules that govern list structure updates during normal operation need not apply. The list structure must be correct when recovery completes; it does not need to be correct at every intermediate point.

BACKGROUND

    Some message brokers (e.g. the MQSeries product for OS/390) support shared queues. Shared queue messages are stored as list entries in one or more coupling facility (CF) list structures. Applications running on multiple queue managers in the same queue sharing group (QSG), anywhere in a parallel sysplex, can then access these shared queue messages. This arrangement provides continuous availability, scalable capacity, and automatic pull workload balancing.

    Persistent shared queue messages can be supported by periodically taking non-disruptive fuzzy backups of each CF list structure, and by having each queue manager in the QSG log two things: the message identification and message content of each shared queue message it PUTs, and the message identification of each shared queue message it GETs. If a CF list structure fails, its shared queue messages can then be recovered by restoring the latest fuzzy backup of that structure, positioning the recovery log of each queue manager in the QSG to the time of the fuzzy backup, and replaying the shared queue update operations from these logs, in timestamp order across the QSG, into the CF structure.
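    The baseline recovery sequence described above can be sketched as follows. This is a minimal illustration only, not the product's implementation: the CF list structure is modelled as an in-memory dictionary, and the names LogRecord and recover_structure, the integer timestamps, and the assumption that each queue manager's log is already in timestamp order are all hypothetical. Because the backup is fuzzy, a replayed operation may already be reflected in it, so the sketch assumes that replay is idempotent.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class LogRecord:
    """One logged shared queue operation (hypothetical record layout)."""
    timestamp: int                   # assumed sysplex-wide clock value
    op: str                          # "PUT" or "GET"
    msg_id: str                      # message identification
    content: Optional[bytes] = None  # message content, logged for PUTs only


def recover_structure(
    backup: Dict[str, bytes],
    backup_time: int,
    logs: Iterable[List[LogRecord]],
) -> Dict[str, bytes]:
    """Restore a fuzzy backup, then replay all logs in timestamp order."""
    # Step 1: restore the latest fuzzy backup of the structure.
    structure = dict(backup)

    # Step 2: position each queue manager's log at the time of the backup.
    positioned = [
        [rec for rec in log if rec.timestamp >= backup_time] for log in logs
    ]

    # Step 3: merge the per-queue-manager logs into one timestamp-ordered
    # stream and replay it. Each log is assumed to be sorted already.
    for rec in heapq.merge(*positioned, key=lambda r: r.timestamp):
        if rec.op == "PUT":
            # Idempotent: re-adding a message already in the backup is harmless.
            structure[rec.msg_id] = rec.content
        elif rec.op == "GET":
            # Idempotent: the message may already be absent from the backup.
            structure.pop(rec.msg_id, None)
        else:
            raise ValueError(f"unknown operation: {rec.op!r}")
    return structure
```

    In this sketch every replayed record results in one update to the structure. In a real CF each such update is a costly access, which is why the number of accesses made during log replay matters for recovery time.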

    If a CF structure fails, customers will want to recover it as quickly as possible. Replaying the recovery logs accounts for the major portion of the recovery time. Recovery time can therefore be minimised by taking frequent fuzzy backups, which shortens the span of log to be replayed, and by processing the logs efficiently during replay.

    This disclosure describes techniques to minimise costly CF accesses during log replay, and thus to reduce the elapsed time needed to recover the failed CF list structure.

PROBLEM DESCRIPTION

    Each shared queue corresponds to an MQPUT list header in a CF list structure. Each shared queue message corresponds to a list entry on that list header. Each list entry on an MQPUT list is identified by a unique key. The first byte of this key identifies whether the message is committed or uncommitted. MQGETs from a queue only look at the committed section of the MQPUT list for that queue except for the special case where the MQGET is in the same u...