Establishing a Configuration Coordinator for Highly Available Systems
Original Publication Date: 1983-Jun-01
Included in the Prior Art Database: 2005-Feb-07
A highly available system must coordinate the detection of, and recovery from, failures. This function is performed by distributed software subsystems, called auditors, each of which resides in a separate processor. An auditor is a collection of tasks responsible for (1) recording failures reported by the operating system, the database subsystems, and the data communications subsystem, as well as failures the auditor itself detects by diagnosing these subsystems; (2) initiating and monitoring appropriate actions to reconfigure the system, thereby shielding users from the effects of subsystem or processor failures; and (3) responding to system status queries and system reconfiguration requests from the operator.
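The three auditor responsibilities described above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the names Failure, Auditor, record_failure, reconfigure, and status, the subsystem labels, and the idea of a table mapping a failing subsystem to a recovery action are all assumptions made for illustration. Operator reconfiguration requests and communication between auditors on different processors are not modeled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Failure:
    source: str                   # hypothetical label, e.g. "database"
    detail: str
    self_diagnosed: bool = False  # True if the auditor found it by its own diagnosis


@dataclass
class Auditor:
    processor_id: str
    # Assumed mapping from a failing subsystem to a reconfiguration action.
    recovery_actions: Dict[str, Callable[[Failure], str]] = field(default_factory=dict)
    failure_log: List[Failure] = field(default_factory=list)
    action_log: List[str] = field(default_factory=list)

    def record_failure(self, failure: Failure) -> None:
        """(1) Record a reported or self-diagnosed failure, then attempt recovery."""
        self.failure_log.append(failure)
        self.reconfigure(failure)

    def reconfigure(self, failure: Failure) -> None:
        """(2) Initiate a reconfiguration action and log its outcome."""
        action = self.recovery_actions.get(failure.source)
        if action is None:
            self.action_log.append(f"no action registered for {failure.source}")
        else:
            self.action_log.append(action(failure))

    def status(self) -> Dict[str, object]:
        """(3) Answer an operator status query."""
        return {
            "processor": self.processor_id,
            "failures_recorded": len(self.failure_log),
            "actions_taken": list(self.action_log),
        }


auditor = Auditor(
    processor_id="P1",
    recovery_actions={
        "database": lambda f: f"switched database to backup after: {f.detail}",
    },
)
auditor.record_failure(Failure("database", "log volume unreachable"))
auditor.record_failure(Failure("data_communications", "line timeout", self_diagnosed=True))
report = auditor.status()
print(report)
```

In this sketch each auditor keeps its own failure and action logs, mirroring the statement that every auditor resides in a separate processor. A failure with no registered action is still recorded, so it remains visible to a later status query.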