Correctly Finding a Failed Task in a Pre-Emptive Non-Time-Sliced Operating System

IP.com Disclosure Number: IPCOM000131858D
Publication Date: 2005-Nov-21
Document File: 5 page(s) / 21K

Publishing Venue

The IP.com Prior Art Database

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 50% of the total text.

DETECTING FAILED TASKS

Disclosed Anonymously

Background

The problem described here applies to pre-emptive, multi-tasking operating systems that are not purely time-sliced. The solution disclosed herein applies to this type of OS.

A watchdog is a utility that monitors all of the current tasks to ensure that they are functioning.  Each task reports to, or “kicks”, the watchdog at task-specific intervals to let the watchdog know that it is still running.  Timeout values, also known as reporting periods, vary from one task to another. A task is considered expired when its timeout period elapses without a report.

If the watchdog does not receive a signal, or “kick”, from a task within the task’s timeout value, the task has expired, indicating a failed task.
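The bookkeeping described above can be sketched in a few lines. This is a minimal illustration, not the implementation discussed in this disclosure: the class `Watchdog`, its method names, the task names, and the use of integer tick counts supplied by the caller are all assumptions made for the example.

```python
class Watchdog:
    """Toy watchdog: tracks the last report time of each task (hypothetical API)."""

    def __init__(self):
        self._timeout = {}    # task name -> reporting period, in ticks
        self._last_kick = {}  # task name -> tick of the most recent report

    def register(self, task, timeout, now):
        # Each task has its own timeout value (reporting period).
        self._timeout[task] = timeout
        self._last_kick[task] = now

    def kick(self, task, now):
        # A task reports that it is still running.
        self._last_kick[task] = now

    def expired_tasks(self, now):
        # A task is expired when more than `timeout` ticks have passed
        # since its last report.
        return [
            task
            for task, limit in self._timeout.items()
            if now - self._last_kick[task] > limit
        ]


wd = Watchdog()
wd.register("comms", timeout=10, now=0)   # hypothetical task names
wd.register("logger", timeout=50, now=0)
wd.kick("comms", now=8)
print(wd.expired_tasks(now=20))
```

At tick 20, `comms` last reported 12 ticks ago against a timeout of 10, so it is listed as expired; `logger` is still within its 50-tick period.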

Some currently implemented operating systems use two mechanisms to ensure functionality of software:

  1. “Kicking” the watchdog so the device does not reset
  2. Having all the tasks report periodically to the watchdog

Note: the watchdog is the task with the highest priority; it does not report to itself.

1.  Technical Problem

Because a non-time-sliced OS schedules tasks strictly by priority, a higher-priority task that has failed to report can prevent lower-priority tasks from running.
Therefore, for debugging purposes, it is very useful to be able to identify precisely which task failed to report to the watchdog.

An existing OS implementation for finding the timed-out task that failed to report to the watchdog task reports the wrong information.

That implementation searches the tasks in no particular order for one that failed to report. It stops as soon as it finds one expired task and reports that task as the failing task. This is incorrect, because the first expired task found is not necessarily the task that failed. Since tasks are executed in order of priority, if one task expires, lower-priority tasks will eventually expire as well. The failing task may therefore be any task with a higher priority than the first expired task found.
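The starvation effect described above can be reproduced with a toy simulation. The sketch below is not the OS code discussed in this disclosure. The `Task` fields, the one-tick scheduler, the convention that a lower number means a higher priority, and the task names are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    priority: int         # assumption: lower number = higher priority
    period: int           # ticks between wake-ups
    timeout: int          # watchdog reporting period, in ticks
    last_report: int = 0  # tick of the most recent report
    next_wake: int = 0    # tick at which the task becomes ready again
    stuck: bool = False   # a stuck task never yields and never reports


def simulate(tasks, ticks):
    """Toy pre-emptive, non-time-sliced scheduler: the highest-priority
    ready task always gets the CPU for the current tick."""
    for now in range(ticks):
        ready = [t for t in tasks if t.next_wake <= now]
        if not ready:
            continue
        running = min(ready, key=lambda t: t.priority)
        if running.stuck:
            continue  # hogs the CPU; lower-priority tasks cannot run
        running.last_report = now
        running.next_wake = now + running.period


def expired(tasks, now):
    """Tasks whose timeout has elapsed since their last report."""
    return [t for t in tasks if now - t.last_report > t.timeout]


tasks = [
    Task("high", priority=1, period=5, timeout=100),
    Task("mid", priority=2, period=5, timeout=50, stuck=True),
    Task("low", priority=3, period=5, timeout=10),
]
simulate(tasks, 30)
print([t.name for t in expired(tasks, 30)])
```

In this run only `low` has expired by tick 30, so a search that stops at the first expired task would blame `low`. The task that actually stopped yielding is `mid`, which has not yet expired only because its timeout is longer. `high` can still pre-empt `mid` and keeps reporting.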

A solution was needed that finds the correct failing task efficiently.

2.  Novel Aspect of Invention

Our new implementation is intended to find the failing task precisely and optimally by using an effective search algorithm. It uses elimination via top and bottom delimiters to limit the scope of the search for the failing task.

3. An Overview Drawing

Figure 1: Tasks reporting to the watchdog task.

The watchdog is a task that monitors all tasks in the operating system.  Each task reports to, or “kicks”, the watchdog at task-specific intervals to let the watchdog know that it is still running.  Timeout values, also known as reporting periods, vary from one task to another. A task is considered expired when its timeout period elapses without a report.

4. A High-Level Block Diagram

The nature of a pre-emptive, priority-based operating system will prevent other tasks with lower priority from running...