Autonomic Watchdog Algorithm and Implementation Improving System Recovery While Preventing Infinite Reset Loops

IP.com Disclosure Number: IPCOM000035613D
Original Publication Date: 2005-Jan-26
Included in the Prior Art Database: 2005-Jan-26
Document File: 2 page(s) / 34K

Publishing Venue

IBM

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.

Watchdog timers are widely used to recover from software and firmware hangs in processor-based systems. The basic premise is that the processor must "ping" an address or register within a preset amount of time; if the ping does not occur before the time expires, the processor is reset in the hope of recovering from the error. There are currently two methods available to implement watchdog timers. Many processors have a built-in soft watchdog timer: if the processor does not service the timer before it expires, the timer initiates a soft reset of the processor. The second method is to use an external watchdog chip. Many microprocessor supervisor chips include a watchdog timer such that, if one of the pins is not toggled within a preset time (usually hard-coded), either a hard reset or an NMI is generated to the processor, depending on the component chosen. Neither of these traditional methods offers an obvious way to prevent reset loops or to improve recoverability.
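
The ping-or-reset premise described above can be illustrated with a small behavioral model. This is only a sketch and does not represent any particular processor or supervisor chip: the class name BasicWatchdog, the tick-based time base, and the timeout value are assumptions made for illustration.

```python
class BasicWatchdog:
    """Behavioral model of a traditional watchdog timer (hypothetical names)."""

    def __init__(self, timeout_ticks):
        self.timeout_ticks = timeout_ticks  # preset time, in arbitrary ticks
        self.elapsed = 0                    # ticks since the last ping
        self.resets = 0                     # resets issued to the processor

    def ping(self):
        """The processor services the timer, restarting the countdown."""
        self.elapsed = 0

    def tick(self):
        """Advance one tick; return True if this tick caused a reset."""
        self.elapsed += 1
        if self.elapsed >= self.timeout_ticks:
            self.resets += 1
            self.elapsed = 0  # the countdown restarts after the reset
            return True
        return False


# A healthy processor pings every 3 ticks against a 5-tick timeout.
wd = BasicWatchdog(timeout_ticks=5)
for t in range(1, 21):
    if t % 3 == 0:
        wd.ping()
    wd.tick()
healthy_resets = wd.resets

# A hung processor never pings, so the timer expires on the fifth tick.
hung = BasicWatchdog(timeout_ticks=5)
hung_fired = [hung.tick() for _ in range(5)]
```

In this model the healthy processor is never reset, while the hung processor is reset as soon as the preset time elapses without a ping.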

The major drawback of these implementations is the ease with which infinite reset loops can be generated. If the hang condition occurs early, or even during the boot code, there may be no way to disable the watchdog in time. The processor begins to boot, hits the hang condition, and is reset; this loop can repeat indefinitely. If there are other processors or intelligent devices in the system that monitor the status of such watchdog events, servicing the events can create significant overhead. For instance, if there are redundant processors, this loop can cause problems with trading off master/slave functionality.
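
The reset loop can be made concrete with a short simulation. It is a sketch under stated assumptions: the function name and the hang_at_tick parameter (the point after each reset at which the boot code hangs) are hypothetical, and time is measured in arbitrary ticks.

```python
def boot_with_traditional_watchdog(timeout_ticks, hang_at_tick, max_ticks):
    """Count resets when boot code hangs and stops servicing the watchdog.

    hang_at_tick is a hypothetical parameter: the tick after each reset
    at which the boot code reaches its hang condition.
    """
    resets = 0
    since_reset = 0  # ticks since the last reset (boot progress)
    since_ping = 0   # ticks since the watchdog was last serviced
    for _ in range(max_ticks):
        since_reset += 1
        since_ping += 1
        hung = since_reset >= hang_at_tick
        if not hung:
            since_ping = 0       # running code services the watchdog
        if since_ping >= timeout_ticks:
            resets += 1          # watchdog expires and resets the processor
            since_reset = 0      # boot starts over and will hang again
            since_ping = 0
    return resets


short_run = boot_with_traditional_watchdog(5, 2, 60)
long_run = boot_with_traditional_watchdog(5, 2, 600)
```

With these values the number of resets grows in proportion to the observation window (one reset every six ticks), which is the unbounded loop described above: nothing in the traditional scheme ever stops it.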

To prevent such reset loops, this publication proposes a new algorithm and implementation that provide the reset functionality of traditional watchdog circuits while preventing reset loops and supplying statistics that give the processor information which could aid in recovering from the error.

The implementation can be done using a programmable logic device such as a CPLD or FPGA. There are two major benefits to this solution. One is the ability to enable the watchdog timer without causing a reset condition; the other is the ability to track statistics on how many times the timer has expired. This information and ability give the processor a better chance of recovering from the error as well as of "cleanly" reporting the error condition to other intelligent devices in the system. Since the programmable logic device is on a separate reset boundary, w...
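
One way to picture the two benefits named above is a watchdog whose expiry counter survives processor resets and whose reset output can be withheld. The sketch below is not the disclosed algorithm, whose details are not available in this abbreviated text. The max_resets limit, the reset_enabled control bit, and all names are assumptions chosen only to show how a persistent count could bound a reset loop.

```python
class CountingWatchdog:
    """Hypothetical model of a watchdog held in a programmable logic device."""

    def __init__(self, timeout_ticks, max_resets):
        self.timeout_ticks = timeout_ticks
        self.max_resets = max_resets  # assumed policy limit, not from the source
        self.reset_enabled = True     # assumed control bit: False means count only
        self.elapsed = 0
        self.expiry_count = 0         # statistic kept outside the processor's
                                      # reset boundary, so it survives resets

    def ping(self):
        """The processor services the timer."""
        self.elapsed = 0

    def tick(self):
        """Advance one tick; return True if the processor is reset."""
        self.elapsed += 1
        if self.elapsed < self.timeout_ticks:
            return False
        self.elapsed = 0
        self.expiry_count += 1        # always record the expiry
        # Assumed policy: stop resetting once the limit has been reached.
        return self.reset_enabled and self.expiry_count <= self.max_resets


def run_hung_boot(wd, max_ticks):
    """Boot code that always hangs and never pings; count resets issued."""
    return sum(1 for _ in range(max_ticks) if wd.tick())


wd = CountingWatchdog(timeout_ticks=5, max_resets=3)
resets = run_hung_boot(wd, 100)

monitor = CountingWatchdog(timeout_ticks=5, max_resets=3)
monitor.reset_enabled = False         # timer enabled, reset output withheld
monitor_resets = run_hung_boot(monitor, 100)
```

In this model the hung boot is reset only three times even though the timer expires twenty times, and the count remains readable afterwards. With the reset output withheld, expiries are still counted but no reset is issued.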