Unattended System Monitor

IP.com Disclosure Number: IPCOM000101506D
Original Publication Date: 1990-Aug-01
Included in the Prior Art Database: 2005-Mar-16
Document File: 5 page(s) / 178K

Publishing Venue

IBM

Related People

Bartol, TM: AUTHOR [+3]

Abstract

This article describes an unattended system monitor (USM) which includes a hardware watchdog timer that prevents a software "hung" state from disabling a computer system's fault-tolerant hardware.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

      A hardware fault-tolerant computer system could be rendered
unusable if the application or a high-priority interrupt prevents the
operating system scheduler from switching from the present task to
servicing users.  This rare condition requires an operator to reboot
the machine. Systems installed at remote (operatorless) sites could
be "hung" for long periods of time.  The USM disclosed herein uses an
independent crystal-controlled hardware watchdog timer to recover the
"hung" system automatically.

      Although a watchdog timer for software is not new, two
elements disclosed herein add a new dimension to the concept: a
programmable expiration time, and use of the system's ability to save
the entire machine's status (Level 7 interrupt) if the watchdog timer
circuits fire.  A fail-safe power off/on circuit backs up the Level 7
circuits to ensure hardware recovery.
The watchdog timer is activated, maintained and disabled under
software control and provides automatic recovery upon loss of
hardware control by the operating system.  The watchdog timer is
always disabled on power up. System software must program the USM to
a time-out value and activate the watchdog timer to begin its
operation.  Once activated by a software command, the USM watchdog
timer counts up to the programmed time-out value.  If the USM
watchdog timer is reset by a software command before timing out, the
system is assumed to be operating normally and no action is taken by
the USM.
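
      The activate/reset/disable cycle described above can be
illustrated with a small software model.  This is only a sketch under
stated assumptions, not the USM hardware or its command set: the class
name, the method names and the use of abstract "ticks" as the time
unit are all hypothetical.

```python
class WatchdogTimer:
    """Toy model of a programmable watchdog counter (hypothetical names)."""

    def __init__(self):
        # Per the text, the watchdog is always disabled on power-up.
        self.enabled = False
        self.timeout = None  # programmed time-out value, in ticks (assumed unit)
        self.count = 0

    def program(self, timeout_ticks):
        """Software programs the time-out value before activation."""
        if timeout_ticks <= 0:
            raise ValueError("time-out must be positive")
        self.timeout = timeout_ticks

    def activate(self):
        """Software command that starts the count from zero."""
        if self.timeout is None:
            raise RuntimeError("program a time-out value before activating")
        self.count = 0
        self.enabled = True

    def reset(self):
        """Software command issued periodically by a healthy system."""
        self.count = 0

    def disable(self):
        """Software command that stops the watchdog."""
        self.enabled = False
        self.count = 0

    def tick(self):
        """Advance one clock tick; return True once the time-out is reached."""
        if not self.enabled:
            return False
        self.count += 1
        return self.count >= self.timeout
```

      In this model a healthy system calls reset() more often than
once per programmed time-out, so tick() never returns True; a "hung"
system stops calling reset() and the counter eventually expires.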

      Should the USM watchdog timer be activated and neither reset
nor disabled, the USM hardware will provide the system with a Reboot
signal (Level 7) upon reaching the time-out value.  The USM will
allow a maximum of eight minutes for the system to react to the Level
7 Reboot signal.  This includes copying the system's status to a
direct-access storage device (DASD) for future analysis as to the
cause of the "hung" condition and disabling the USM watchdog timer
prior to rebooting the system.  Should the system not react to the
Reboot signal within the allotted time, the USM watchdog circuits
initiate a system power off.  During the power-off condition the USM
uses storage capacitors as an internal power source to recover the
system and turn the power back on.
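
      The two-stage recovery described above (Level 7 Reboot signal
first, power cycle as the fail-safe) can be sketched as a simple
decision function.  The names below are hypothetical, and treating a
response at exactly eight minutes as "within the allotted time" is an
assumption; the text does not specify the boundary.

```python
from enum import Enum

# "A maximum of eight minutes" per the text, expressed in seconds.
REBOOT_GRACE_SECONDS = 8 * 60


class Action(Enum):
    LEVEL7_REBOOT = "Level 7 Reboot signal"
    POWER_CYCLE = "power off, then power back on"


def recovery_actions(response_delay_s):
    """Return the actions taken after the watchdog times out.

    response_delay_s is the time (seconds) the system needs to save its
    status, disable the watchdog and reboot; None means it never reacts.
    """
    # Stage 1: the Reboot signal is always raised on time-out.
    actions = [Action.LEVEL7_REBOOT]
    # Stage 2: fail-safe power cycle if the system did not react in time
    # (boundary handling is an assumption of this sketch).
    if response_delay_s is None or response_delay_s > REBOOT_GRACE_SECONDS:
        actions.append(Action.POWER_CYCLE)
    return actions
```

      For example, recovery_actions(120) yields only the Level 7
Reboot signal, while recovery_actions(None) also includes the power
cycle.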

      Fig. 1 is a functional block diagram of the USM.  The USM
functions are disabled and reset when voltage is first applied to the
USM by way of the keep-alive reset RC circuit or the power-on-reset
RC circuit.  The RC circuit time constants provide the appropriate
signal delays.  The USM can also be disabled by the serial data
recovery circuits by decoding A0 = 1 and D7 -> D0 = E2 hex, as
shown in Fig. 2. Either method will cause the enab...