Memory Recovery Facility for Computer Systems

IP.com Disclosure Number: IPCOM000107624D
Original Publication Date: 1992-Mar-01
Included in the Prior Art Database: 2005-Mar-22
Document File: 2 page(s) / 114K

Publishing Venue

IBM

Related People

Huynh, DQ: AUTHOR [+2]

Abstract

Described is a software maintenance facility that provides memory recovery for computer systems. The facility is aimed primarily at improving the performance of fault-tolerant systems, which utilize many check points, by optimizing the insertion of recovery points.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      In computers equipped with fault-tolerant systems, a
maintenance processor (MP) serves as a functionally and physically
separate monitoring computer.  The MP probes the operation of a host
processor (HP) in real time (1) to ensure that the functional
behavior of the HP does not deviate from that specified by its
design and by the program being executed.  Generally, process
back-up and checkpointing, also known as backup error recovery or
rollback recovery, is used by the MP to recover from either transient
or hardware faults.  The technique assumes deterministic process
execution: two processes yield identical outputs, provided that they
have identical initial states and inputs.  However, small computer
systems generally do not incorporate fault-tolerant facilities.
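
      The determinism assumption behind rollback recovery can be
illustrated with a minimal Python sketch.  It is not taken from the
disclosure: the function names, the dictionary standing in for process
state, and the simulated fault are all assumptions made for
illustration.

```python
import copy

def run_with_rollback(state, steps, fault_at=None):
    """Run `steps` on `state`, taking a checkpoint before each step.

    `fault_at` is a hypothetical knob: the index of the step whose
    result is corrupted by a simulated transient fault.
    """
    for index, step in enumerate(steps):
        checkpoint = copy.deepcopy(state)      # recovery point
        step(state)
        if index == fault_at:
            for key in state:                  # simulated corruption
                state[key] = None
            state.clear()
            state.update(checkpoint)           # roll back to the checkpoint
            step(state)                        # deterministic re-execution
    return state

def add_one(s):
    s["x"] += 1

def double(s):
    s["x"] *= 2

clean = run_with_rollback({"x": 1}, [add_one, double])
recovered = run_with_rollback({"x": 1}, [add_one, double], fault_at=1)
```

      Because each step depends only on the state it is given, the
recovered run ends in the same state as the fault-free run.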

      Typically, the problems raised by the process back-up and
checkpoint recovery facility involve the integrity of the memory in
the event of a rollback.  For example, when a sequence that reads and
then writes the same address is re-executed after a rollback, the
read returns the value stored by the earlier write rather than the
original contents, so the memory reference does not yield the same
result.  In order to undo the effects of the instructions, several
methods of restoring the memory contents have been proposed, as
follows:
      In prior art (2), read and write instructions addressing the
same variables were separated by a rollback point, which increased
the system overhead.  In (3), the proposal was made to lengthen the
repeatable sequence by copying the global variables of the
application into local program variables, with the sequence
terminating by updating the global variables.  This method, however,
depends on the programmer to identify the global variables that must
be copied.  In (4), the proposal was made to slow the storage
operations long enough to read and save on a stack the contents of
the memory location to be modified.  The disadvantage of this method
is a decrease in system performance, again due to overhead.
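
      The read-then-write problem, and the save-before-store idea
attributed above to (4), can be sketched in Python as follows.  This
is a rough illustration only: the class `LoggedMemory`, its method
names, and the address 0x10 are hypothetical and do not come from the
cited work.

```python
class LoggedMemory:
    """Toy memory that saves old contents on a stack before each store."""

    def __init__(self, cells):
        self.cells = dict(cells)   # address -> contents (pre-initialized)
        self.undo_stack = []       # (address, old contents) pairs

    def read(self, address):
        return self.cells[address]

    def write(self, address, value):
        # Extra read before every store: this is the overhead noted above.
        self.undo_stack.append((address, self.cells[address]))
        self.cells[address] = value

    def set_recovery_point(self):
        self.undo_stack.clear()

    def rollback(self):
        # Undo stores in reverse order back to the last recovery point.
        while self.undo_stack:
            address, old = self.undo_stack.pop()
            self.cells[address] = old

def increment(memory, address):
    """Read-then-write sequence on a single address."""
    value = memory.read(address)
    memory.write(address, value + 1)
    return value

memory = LoggedMemory({0x10: 5})
memory.set_recovery_point()
first_read = increment(memory, 0x10)    # reads 5, stores 6
stale_read = memory.read(0x10)          # a naive re-run would read this
memory.rollback()                       # restores the saved contents
second_read = increment(memory, 0x10)   # re-execution reads 5 again
```

      Without the rollback step, re-executing the sequence would
start from the already-modified value, which is the integrity problem
described above.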

      To avoid the drawbacks of the aforementioned approaches, the
concept described herein utilizes a recovery facility that
optimizes the insertion of the recovery points.  The MP first
allocates a read and write internal buffer to each HP task.  During
run time, the MP takes two actions.  For a memory read reference, the
MP saves both the address and the contents of the first occ...