Dynamic Thresholding of Sequentially Linked Errors

IP.com Disclosure Number: IPCOM000106611D
Original Publication Date: 1993-Dec-01
Included in the Prior Art Database: 2005-Mar-21
Document File: 2 page(s) / 82K

Publishing Venue

IBM

Related People

Shieh, JM: AUTHOR

This is the abbreviated version, containing approximately 52% of the total text.

      Disclosed is an improvement to the error logging process.
Error logging is a two-step process.  In the first step, an
application program or kernel process logs an error by sending an
error id to the error device driver.  Each error id identifies
exactly one type of error, and each error type has a corresponding
predefined template.  After the error device driver receives the
error id, it alerts a sleeping error daemon that a new error id is
sitting in the driver's internal queue, and then exits.  In the
second step, the awakened daemon removes the error id from the
device driver's internal queue, translates it into the appropriate
error logging template, and places the expanded data into the
errorlog file.  The errorlog file is the one the user accesses when
analyzing errors in the system.
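
      The two-step path described above can be sketched roughly as
follows.  This is a minimal, single-threaded illustration and not the
AIX implementation: the class names, the error ids, the template
strings, and the wakeup counter that stands in for alerting the
sleeping daemon are all assumptions made for the example.

```python
from collections import deque

# Hypothetical error ids and templates; the real template contents
# are not given in the disclosure.
TEMPLATES = {
    0x01: "DISK_RESET: disk device was reset",
    0x02: "SCSI_TIMEOUT: command to device timed out",
}


class ErrorDeviceDriver:
    """Step 1: accepts error ids and holds them in an internal queue."""

    def __init__(self):
        self.queue = deque()
        self.daemon_wakeups = 0  # stand-in for alerting the sleeping daemon

    def log_error(self, error_id):
        # Queue the id, signal the daemon, and return to the caller.
        self.queue.append(error_id)
        self.daemon_wakeups += 1


class ErrorDaemon:
    """Step 2: expands queued error ids into errorlog entries."""

    def __init__(self, driver, templates):
        self.driver = driver
        self.templates = templates
        self.errorlog = []  # stand-in for the errorlog file

    def drain(self):
        # Remove each id from the driver's queue, translate it through
        # its template, and append the expanded entry to the errorlog.
        while self.driver.queue:
            error_id = self.driver.queue.popleft()
            entry = self.templates.get(
                error_id, "UNKNOWN error id %#x" % error_id
            )
            self.errorlog.append(entry)


driver = ErrorDeviceDriver()
daemon = ErrorDaemon(driver, TEMPLATES)
driver.log_error(0x01)
driver.log_error(0x02)
daemon.drain()
```

After the call to drain(), daemon.errorlog holds one expanded entry
per queued id, in the order the ids were logged, and the driver's
queue is empty.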

      The problem with the present error logging method is that it
registers events that should not be considered errors.  These events
are side effects of a true error.  As an example, consider the device
drivers attached to a Small Computer System Interface (SCSI) bus.  In
the design of AIX*, when the SCSI device driver issues a command to a
peripheral device, it starts an internal timer.  If the command has
not returned by the end of that time period, the SCSI device driver
assumes that something is wrong with that peripheral device and
issues a SCSI bus reset.  The reset causes the potentially hung
device to reset itself.  The side effect is that the other devices
that share the SCSI bus with the hung peripheral are also reset.
When the devices that are not hung receive the reset, they log it to
the error device driver.  An innocent device driver has no way to
know that the true error lies with another device and not with
itself.  Additionally, when the resetting device is a disk drive, the
user does not see any effect of the disk resetting itself.  The reset
is not detectable, and the operating system continues to run without
interruption.  Thus, the disk device logs a needless message to the
error device driver stating that the disk was reset.  The error l...
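
      The side effect in the SCSI example can be illustrated with a
small sketch.  It is only a model of the scenario described above:
the function name, the device names, and the "DEVICE_RESET" event
label are hypothetical, and the sketch records only the entries that
the text says the devices that are not hung will log.

```python
def entries_after_command(devices_on_bus, target, command_returned):
    """Return the (device, event) entries logged for one command.

    If the command returned before the driver's timer expired, nothing
    is logged.  Otherwise the driver resets the whole bus, and every
    device other than the presumed-hung target logs its own reset.
    """
    if command_returned:
        return []
    # Bus reset: each device that is not hung logs the reset it
    # received, even though the true error lies with the target.
    return [
        (dev, "DEVICE_RESET")
        for dev in devices_on_bus
        if dev != target
    ]


bus = ["disk0", "disk1", "tape0"]  # hypothetical devices on one bus
side_effects = entries_after_command(bus, "tape0", command_returned=False)
```

Here side_effects contains one entry each for disk0 and disk1.  Both
entries stem from the single timeout on tape0, which is the kind of
sequentially linked error the disclosure is concerned with.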