Use of Verifiable Virtual Memory Accesses for Inter-Processor Surveillance

IP.com Disclosure Number: IPCOM000118827D
Original Publication Date: 1997-Jul-01
Included in the Prior Art Database: 2005-Apr-01
Document File: 2 page(s) / 118K

Publishing Venue

IBM

Related People

Hamilton, RA: AUTHOR [+4]

Abstract

Disclosed is a method for ascertaining continued performance of a primary computer process by a monitoring process. Rather than implementing explicit heartbeats to be used as a surveillance mechanism, the monitoring process watches registers, Non-Volatile Random Access Memory (NVRAM), and various other virtual memory address spaces for activity to determine the well-being of the primary process.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 49% of the total text.

      Surveillance is an increasingly desirable function within
computer systems, and new methods of providing it are continually
sought.  The usual implementation consists of explicit "heartbeats"
between processors, which one processor (or, more exactly, one
process) uses to ascertain the well-being of another.  These
heartbeats are signals sent back and forth for the express purpose of
telling a monitoring process that a critical processor is still
functional.  By this definition, a critical processor must spend
computational resources to send the intermittent signal.  The absence
of a heartbeat, defined as a predetermined interval of time without
one, indicates that an error has occurred in either hardware or
software.  When this happens, the monitoring process will typically
take some action.
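
      The explicit heartbeat scheme described above can be sketched
from the monitoring side as follows.  This is only an illustration
under stated assumptions, not the implementation in the disclosure:
the class name HeartbeatMonitor, its methods, and the use of seconds
as the time unit are all hypothetical, and timestamps are passed in
explicitly so the example is deterministic.

```python
from dataclasses import dataclass


@dataclass
class HeartbeatMonitor:
    """Monitoring-side view of an explicit heartbeat protocol (hypothetical)."""
    timeout: float          # predetermined interval; seconds are an assumed unit
    last_beat: float = 0.0  # time at which the most recent heartbeat arrived

    def record_heartbeat(self, now: float) -> None:
        # Called each time the critical processor sends its explicit signal.
        self.last_beat = now

    def is_alive(self, now: float) -> bool:
        # A heartbeat counts as absent once more than `timeout` has elapsed
        # since the last one was received.
        return (now - self.last_beat) <= self.timeout


monitor = HeartbeatMonitor(timeout=5.0)
monitor.record_heartbeat(now=10.0)
status_at_12 = monitor.is_alive(now=12.0)  # still inside the interval
status_at_16 = monitor.is_alive(now=16.0)  # interval exceeded
```

      In this sketch every call to record_heartbeat stands for work
the critical processor performs solely for the monitor's benefit.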

      A problem that arises with the inclusion of surveillance is
that of overhead tasking.  While the primary processor is sending out
heartbeats, it is not performing "useful" functions; it is not
contributing to the primary processes dictated by the user.  Thus,
adding the safety precaution of surveillance exposes the fundamental
problem:  how to minimize tasking overhead and maximize critical
processor throughput while establishing effective surveillance
between processors.

      The solution to minimizing overhead tasking centers upon
preserving critical processor efficiency while maintaining effective
surveillance.  This can best be accomplished by allowing the
monitoring processor to detect normal operations undertaken by the
critical processor and, further, allowing detection of these normal
events to constitute heartbeats, thus freeing the critical processor
from the task of sending out explicit heartbeat commands during
standard operation.  Specifically, this is accomplished by directing
the monitoring processor to watch activity occurring within shared
virtual memory space.  By taking this approach, no undue burden is
placed on the critical processor, and explicit heartbeats--the direct
signals which incur extraneous overhead--are minimized.
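
      The passive approach can be sketched in the same style.  The
monitor below treats any observed change in a watched memory region
as an implicit heartbeat, so the critical process sends nothing
explicitly.  Several things here are assumptions made for
illustration: a bytearray stands in for the shared virtual memory or
NVRAM, activity is detected by comparing snapshots of its contents
(the disclosure does not say how accesses are observed), and the
names PassiveMonitor and poll are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PassiveMonitor:
    """Treats changes in a watched region as implicit heartbeats (hypothetical)."""
    region: bytearray           # stand-in for shared memory or NVRAM (assumption)
    timeout: float              # allowed quiet interval; seconds are assumed
    last_activity: float = 0.0  # time at which a change was last observed
    _snapshot: bytes = field(default=b"", init=False)

    def __post_init__(self) -> None:
        # Remember the initial contents so later polls can detect changes.
        self._snapshot = bytes(self.region)

    def poll(self, now: float) -> bool:
        current = bytes(self.region)
        if current != self._snapshot:
            # Ordinary work by the critical process changed the region;
            # count that as a heartbeat.
            self._snapshot = current
            self.last_activity = now
        return (now - self.last_activity) <= self.timeout


shared = bytearray(16)
watcher = PassiveMonitor(region=shared, timeout=5.0)
shared[3] = 0x7F                            # critical process does normal work
alive_after_write = watcher.poll(now=2.0)   # change seen, timer restarted
alive_when_quiet = watcher.poll(now=9.0)    # no change for longer than timeout
```

      One limitation of snapshot comparison is that a write which
stores the same value already present goes unnoticed, so a real
monitor would need a more direct way to observe accesses.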

      An embodiment of this methodology occurs in the interaction
between the RS/6000 Boot Firmware (on the primary processor) and the
RS/6000 Service Processor (the monitoring processor).  Although an
explicit heartbeat command, whose sole purpose is to ensure
surveillance compliance, is architecte...