Automatic non-stop backup of coupling facility list structures containing shared queue messages

IP.com Disclosure Number: IPCOM000016027D
Original Publication Date: 2002-Aug-16
Included in the Prior Art Database: 2003-Jun-21
Document File: 4 page(s) / 51K

Publishing Venue

IBM


This is the abbreviated version, containing approximately 24% of the total text.


Introduction

Disclosed is a technique for non-stop periodic backup by a parallel sysplex of recoverable shared queue messages held in a coupling facility list structure. The technique automatically initiates backups as often as necessary to enable data recovery from the backup and a message log within a specified time. Backups are more frequent when the system is busy and less frequent otherwise. The technique also exploits the multiple system images within the parallel sysplex to ensure that the regular backup cycle continues across system failures.
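One way to picture a load-sensitive backup trigger of this kind is to track how much log data has accumulated since the last backup and to start a new backup once the estimated replay time reaches the recovery target. The following is a minimal sketch under that assumption only. The class name, the byte-based replay estimate, and the `replay_bytes_per_sec` parameter are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass


@dataclass
class BackupTrigger:
    """Decides when a fuzzy backup is due (hypothetical model, not the disclosed one)."""

    recovery_target_sec: float    # assumed: longest acceptable log replay time
    replay_bytes_per_sec: float   # assumed: measured log replay throughput
    bytes_since_backup: int = 0   # log bytes written since the last backup

    def record_log_write(self, nbytes: int) -> None:
        # Called whenever a queue manager logs a shared queue update.
        self.bytes_since_backup += nbytes

    def estimated_replay_sec(self) -> float:
        # More logged data means a longer replay after a structure failure.
        return self.bytes_since_backup / self.replay_bytes_per_sec

    def backup_due(self) -> bool:
        # A busy system reaches the target sooner, so it backs up more often.
        return self.estimated_replay_sec() >= self.recovery_target_sec

    def backup_completed(self) -> None:
        # Simplification: ignores log data written while the backup was running.
        self.bytes_since_backup = 0
```

Because the trigger depends on logged volume rather than on wall-clock time, an idle system would rarely reach the threshold while a busy one would reach it often, which matches the behaviour described above.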

Background

     Shared queue messages are stored as list entries in one or more coupling facility (CF) list structures. Applications running on multiple queue managers in the same queue sharing group (QSG) anywhere in a parallel sysplex can then access these shared queue messages. This provides continuous availability, scalable capacity, and automatic pull workload balancing.

     Persistent shared queue messages are supported by two mechanisms. First, non-disruptive fuzzy backups of each CF list structure are taken periodically. Second, each queue manager in the QSG logs the message identification and message content of every shared queue message it writes to the CF, and the message identification of every shared queue message it destructively reads from the CF. If a CF list structure fails, its shared queue messages are recovered by restoring the latest fuzzy backup of that structure, positioning the recovery log of each queue manager in the QSG to the time of the fuzzy backup, and then replaying the shared queue update operations from these logs into the CF structure in time stamp order across the QSG.
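The recovery sequence just described (restore the backup, position each log, replay in time stamp order) can be sketched as a merge of per-queue-manager logs. This is an illustrative model only: `LogRecord`, `recover_structure`, the `"put"`/`"get"` operation names, and the use of a dictionary to stand in for the CF list structure are all assumptions made for the example, and each log is assumed to be already sorted by time stamp.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class LogRecord:
    timestamp: float
    op: str                          # "put" (write) or "get" (destructive read)
    msg_id: str
    content: Optional[bytes] = None  # logged for writes only


def recover_structure(
    backup: Dict[str, bytes],
    backup_time: float,
    logs: Iterable[List[LogRecord]],
) -> Dict[str, bytes]:
    """Rebuild the structure contents from a fuzzy backup plus the logs."""
    structure = dict(backup)  # step 1: restore the latest fuzzy backup
    # Step 2: position every queue manager's log at the backup time.
    tails = [[r for r in log if r.timestamp >= backup_time] for log in logs]
    # Step 3: replay all logs as one stream in time stamp order.
    for rec in heapq.merge(*tails, key=lambda r: r.timestamp):
        if rec.op == "put":
            structure[rec.msg_id] = rec.content  # harmless if already in the backup
        elif rec.op == "get":
            structure.pop(rec.msg_id, None)      # harmless if absent from the backup
    return structure
```

Both replay operations are written to be safe to repeat. That matters for a fuzzy backup, which may or may not contain the effect of an update that was logged while the backup was in progress.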

     If a CF structure fails, users will want to recover it from the fuzzy backup and recovery logs as quickly as possible. Replaying the recovery logs accounts for the major portion of this recovery time, which can be minimised by taking frequent fuzzy backups and by processing the logs efficiently during replay.

     This disclosure describes a way to automatically initiate fuzzy backups of a CF list structure in such a way that the automatic backup cycle does not stop if a queue manager should fail. Automatic backup initiation is an important function given the desirability of fuzzy backups at intervals of 30 minutes or less.

Problem Description

     The elapsed time taken to recover a failed CF structure from its last fuzzy backup and the recovery logs is most influenced by the frequency of the fuzzy backup. If the latest fuzzy backup of a CF structure is one week old, then the CF structure recovery program will have to replay one week's worth of recovery logs from each queue manager in the QSG. Performance analysis shows that the elapsed time for log replay in a busy system dedicated to shared queue operations is about half the elapsed time since the last fuzzy backup of that CF structure. Thus it might tak...
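
The relationship quoted above, that log replay takes about half the elapsed time since the last fuzzy backup, lends itself to a small worked calculation. The sketch below simply applies that ratio in both directions. The function names are hypothetical, and the ratio is the approximate figure the text gives for a busy system dedicated to shared queue operations, so the results are rough estimates rather than guarantees.

```python
# Approximate ratio from the text: replay time is about half the backup age.
REPLAY_FRACTION = 0.5


def estimated_replay_minutes(minutes_since_backup: float) -> float:
    """Rough log replay time for a backup of the given age."""
    return REPLAY_FRACTION * minutes_since_backup


def max_backup_interval_minutes(replay_target_minutes: float) -> float:
    """Longest backup interval that keeps replay within the target."""
    return replay_target_minutes / REPLAY_FRACTION
```

For example, with backups every 30 minutes, `estimated_replay_minutes(30)` gives a worst-case replay estimate of about 15 minutes. Conversely, a 15-minute replay target implies a backup interval of at most 30 minutes.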