Browse Prior Art Database

Branch Replacement Optimizations for Trace Directed Program Restructuring

IP.com Disclosure Number: IPCOM000106109D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 2 page(s) / 85K

Publishing Venue

IBM

Related People

Heisch, RR: AUTHOR [+2]

Abstract

This disclosure presents an optimization to the algorithm used when generating branch instruction sequences for basic blocks moved while performing a Trace Directed Program Restructuring (TDPR) on an existing executable program. the TDPR process groups highly executed instructions of a program together so as to minimize the working set (i.e., real memory) requirements as well as improving performance by increasing available cache space for a given executable program. In moving these instruction sequences or basic blocks however, the original control transfer target and fall through addresses must be adjusted to account for basic blocks that have already been moved or that are about to be moved so as to preserve original program functionality and in such a way as to take advantage of existing hardware branch prediction logic.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Branch Replacement Optimizations for Trace Directed Program Restructuring

      This disclosure presents an optimization to the algorithm used
when generating branch instruction sequences for basic blocks moved
while performing a Trace Directed Program Restructuring (TDPR) on an
existing executable program.  the TDPR process groups highly executed
instructions of a program together so as to minimize the working set
(i.e., real memory) requirements as well as improving performance by
increasing available cache space for a given executable program.  In
moving these instruction sequences or basic blocks however, the
original control transfer target and fall through addresses must be
adjusted to account for basic blocks that have already been moved or
that are about to be moved so as to preserve original program
functionality and in such a way as to take advantage of existing
hardware branch prediction logic.

The algorithm optimization is described as follows:

      For a machine that predicts branches to not be taken (such as
the RS/6000), the conditional branch instruction at the end of a
basic block is changed, if necessary and if possible, such that the
most probable path taken (target or fall through address) is placed
after the branch (at the fall through address) and the lesser taken
path specified as the branch target.  However, if the branch target
is not reachable from the current instruction or has not yet been
moved and may not be reachable, a branch is generated to a temporary
reachable address and an unconditional branch to the real target is
scheduled to be generated at some later time.  This eliminates the
sequence of a conditional branch (that will incorrectly predicted not
taken) around an unconditional branch to the target and provides an
optimum flow of instructions that does not product "bubbles" or
delays in the instruction pipeline.  It also reduces the number of
typically unexecuted instructions from inside highly executed "local
inner loops" which temporally frees cache space and improves
performance.

      Two cases for optimization are shown next.  Given the following
basic block (instructions IA through IB) terminated with a
conditional branch (BC) to some target address (T) that falls through
to address FT if the branch is not taken:

                IA
 ...