Scheme to Improve the Call/Return Performance on the RS/6000

IP.com Disclosure Number: IPCOM000105282D
Original Publication Date: 1993-Jul-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 4 page(s) / 127K

Publishing Venue

IBM

Related People

Cocke, J: AUTHOR [+3]

Abstract

A scheme to improve the performance of subroutine calls/returns in the IBM RS/6000* is proposed. The principal aim is to reduce the penalty associated with saving and restoring of the link register.

This is the abbreviated version, containing approximately 43% of the total text.

      The current linkage convention on the RS/6000 implies a software
stack that is allocated at load time.  The size of the stack is
controlled by an option flag during compilation.  A protected region
is added to the end of the stack (top of stack) so that a protection
exception is raised when the stack overflows.  Each call to a
subroutine results in "buying a stack frame," unless the called
routine is a leaf.  The stack frame is used to save the general
purpose registers, floating-point registers and special purpose
registers needed to recover the state after return from the
subroutine.  In addition, the stack frame may be used to hold
automatic variables used inside the subroutine.

      It is proposed that the stack be split in two.  One part, the
link stack, contains only the link register contents that need to be
saved as part of the state information, while the other contains the
rest of the stack.  As before, the size of the link stack may be
controlled by a compiler option flag.

      The present architecture is modified to define a new
non-privileged register, called the Link Address Register (LAR).
This register is designed to reflect the address of the top of the
link stack.  All instructions which use special-purpose registers,
e.g., mtspr, mfspr, also recognize the LAR.  The LAR is considered
part of the machine state and must be saved during context switches.
Three new instructions are proposed.  The first is
bst target         ;Branch and stack

      This instruction is an unconditional branch whose target
address is computed by adding the immediate field to the address of
the current instruction.  If the AA (absolute address) bit is set,
the immediate field itself is used as the branch target.  On
executing this instruction, the address of the instruction following
the bst is pushed on the link stack and the LAR value is
simultaneously bumped up.
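      The behavior of bst described above can be sketched as a small
software model.  This is an illustration only, not the disclosed
hardware design.  The class name LinkStackMachine, the 4-byte word
size, the upward growth direction of the link stack, and the choice
that the LAR points at the next free slot are all assumptions made
for the sketch.

```python
WORD = 4  # assumed size in bytes of an instruction and of a link-stack entry


class LinkStackMachine:
    """Hypothetical model of the LAR and the link stack (not from the source)."""

    def __init__(self, link_stack_base):
        self.memory = {}            # sparse model of link-stack storage
        self.lar = link_stack_base  # Link Address Register: top of link stack
        self.pc = 0                 # address of the current instruction

    def bst(self, immediate, aa=False):
        """Branch and stack: push the return address, then branch."""
        return_address = self.pc + WORD          # instruction following the bst
        self.memory[self.lar] = return_address   # push on the link stack
        self.lar += WORD                         # LAR bumped (assumed: upward)
        # AA set: immediate is the target; otherwise it is relative to the bst.
        self.pc = immediate if aa else self.pc + immediate


machine = LinkStackMachine(link_stack_base=0x1000)
machine.pc = 0x200
machine.bst(0x40)              # relative branch to 0x240
machine.bst(0x800, aa=True)    # absolute branch to 0x800
```

After these two calls the model holds two return addresses on the link
stack and the LAR has advanced by two entries.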

      The second new instruction is
ret         ;Return to address on top of stack; pop stack

      On a ret instruction, the LAR is bumped down and the link stack
entry pointed to by the LAR becomes the target address for the
branch.  A 5-bit parameter may be used with the ret instruction to
allow a return to occur through up to 31 levels of nesting.
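      A minimal sketch of ret, including the multi-level form, is given
below.  The text does not say how the 5-bit parameter is encoded, so
the sketch assumes it is the number of link-stack entries to pop (1 to
31).  The function name ret_target, the 4-byte entry size, and the
upward-growing stack are likewise assumptions.

```python
WORD = 4  # assumed size in bytes of one link-stack entry


def ret_target(memory, lar, levels=1):
    """Return (branch_target, new_lar) for a ret through `levels` levels.

    `levels` stands in for the 5-bit parameter; its exact encoding is
    an assumption of this sketch.
    """
    if not 1 <= levels <= 31:
        raise ValueError("levels must be between 1 and 31")
    new_lar = lar - levels * WORD   # LAR bumped down once per level
    return memory[new_lar], new_lar  # entry now pointed to by the LAR


# Three nested calls have pushed three return addresses.
link_stack = {0x1000: 0x104, 0x1004: 0x208, 0x1008: 0x30C}
lar = 0x100C

one_level = ret_target(link_stack, lar)               # ordinary return
three_levels = ret_target(link_stack, lar, levels=3)  # unwind to outermost caller
```

Under these assumptions a single ret with the parameter set to 3
replaces three consecutive returns.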

      To speed up returns, a hardware stack of limited depth is
maintained.  A bst pushes the value that would have been sent to the
link stack into the hardware stack.  When the hardware stack
overflows, its bottom entry is saved in the link stack location
corresponding to that entry, LAR - d, where d is the depth of the
hardware stack.  A pop changes the LAR value and fetches the target
address from the hardware stack.  If
the har...
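
      The part of the hardware stack behavior that is described before
the text breaks off can be modeled as follows.  This is a sketch under
stated assumptions: the LAR is counted in entries rather than bytes,
LAR - d is evaluated before the LAR is bumped, and the class name
HardwareLinkStack is hypothetical.  What happens when the hardware
stack is empty is not covered by the available text, so the sketch
deliberately leaves that case unimplemented.

```python
from collections import deque


class HardwareLinkStack:
    """Hypothetical model of a depth-limited hardware stack with spill."""

    def __init__(self, depth, link_stack_base=0):
        self.depth = depth           # d: number of hardware entries
        self.hw = deque()            # left end is the bottom of the stack
        self.memory = {}             # link stack in storage, indexed by entry
        self.lar = link_stack_base   # LAR in entries (assumption)

    def push(self, return_address):
        """Model of the stacking half of bst."""
        if len(self.hw) == self.depth:
            # Overflow: spill the bottom entry to its link-stack slot, LAR - d.
            self.memory[self.lar - self.depth] = self.hw.popleft()
        self.hw.append(return_address)
        self.lar += 1

    def pop(self):
        """Model of ret while the hardware stack still holds entries."""
        if not self.hw:
            # The available text is cut off before describing this case.
            raise NotImplementedError("empty hardware stack is not modeled")
        self.lar -= 1
        return self.hw.pop()


stack = HardwareLinkStack(depth=2)
for address in (0x10, 0x20, 0x30):
    stack.push(address)        # third push spills 0x10 to link-stack entry 0
target = stack.pop()           # served from the hardware stack
```

In this model only the spilled entry touches storage, and returns are
served from the hardware stack without a memory access.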