Browse Prior Art Database

Efficient Recovery for Interrupted Lock Operations Disclosure Number: IPCOM000033028D
Original Publication Date: 2004-Nov-22
Included in the Prior Art Database: 2004-Nov-22
Document File: 4 page(s) / 60K

Publishing Venue



Provides a means to provide recovery for interrupted lock operations with minimal impact to mainline locking performance.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 24% of the total text.

Page 1 of 4

Efficient Recovery for Interrupted Lock Operations

Some operating systems provide a means to cancel or force user tasks while they are running. If these tasks are running system code then a means for recovery needs to be in place to ensure that data structures and locks are left in consistent states, thus no hangs or abends should occur and the program should continue to function correctly. It is particularly difficult to provide recovery for platforms that have the compare and swap (CS) and compare double and swap (CDS) instructions as it is hard to know whether those instructions were run if a user task is interrupted while running those instructions. Since these instructions are general serialization instructions used in locking facilities, efficient recovery algorithms need to be provided to handle such cases without a negative impact to system performance. The scheme presented here handles a possible interrupted lock operation by placing the lock in an "unsure" state, and resolving the "unsuredness" by examining the state of the system and then placing the lock back in a consisten state, this avoids hangs and incorrect operation and provides the required recovery for the cancel or force command. Normal locking operation performance is not affected until and unless a task is cancelled or forced while updating a lock, which is a rare occurence in practice. The following scheme is used to provide efficient locking operations and yet provide recovery if a locking operation is interrupted. Each running task in the program is represented by a data structure referred to as a CM_RECOV block. A WAIT_ELEMENT is used to describe a waiting task, it contains information used to wake the task up when it is allowed to obtain the lock. A lock is a doubleword of storage with the following format:



e - high order bit of first word, indicates lock is held in write mode R - the address of the CM_RECOV block if the lock is held in write mode, or the count of processes that hold the lock if the lock is held in read mode. w - low order bit of first word, indicates there is contention for the lock or some other issue like the lock is in an unsure state.

W - Address of first WAIT_ELEMENT on the linked list of WAIT_ELEMENTS that represent waiting tasks for the lock. u - unsure bit, low order bit of second lock word, this means that the lock is in an unsure state and that a task is working on resolving the "unsuredness".

A write lock obtain without contention simply issues a CS instruction to store its CM_RECOV block address in the lock, very fast, very simple, very effective.

Each CM_RECOV block contains two lists and a lock:
1. A list of free HELD_LOCK elements. A HELD_LOCK structure is used to repesent a held or pending read lock for a task.
2. A list of held read locks, hence a list of HELD_LOCK structures representing held read locks for the process.
3. A lock in the CM_RECOV structure, the lock is ONLY obtained in write mode.

A read lock obtai...