Half-Cycle Branch Folding within an Instruction Buffer

IP.com Disclosure Number: IPCOM000115673D
Original Publication Date: 1995-Jun-01
Included in the Prior Art Database: 2005-Mar-30
Document File: 2 page(s) / 68K

Publishing Venue

IBM

Related People

Burgess, B: AUTHOR [+2]

Abstract

Superscalar processors seek to detect and remove branches from the instruction stream to maximize overall processor performance. These branches are often removed (or folded) during the dispatch cycle of the processor by looking several instructions ahead of the current dispatch position. Unfortunately, scanning ahead of the current dispatch position for branches requires substantial hardware.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

      For the purposes of this discussion, assume that there are four
instruction buffers, as shown in the Figure.  Two instructions can be
fetched from the instruction cache each cycle, and two instructions
can be dispatched to some number of functional units each cycle.
Each instruction fetched from the instruction cache is either a
branch (br) or a non-branch (notbr) instruction.  The following
branch cases arise on the instr0 and instr1 buses shown in the
Figure for each cache access:
              instr0  instr1
              --------------
     case1:   notbr   notbr
     case2:   notbr   br
     case3:   br      br
     case4:   br      notbr
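
      The four cases above can be written as a small lookup.  This is
only an illustrative sketch: the function name classify_fetch and
its boolean arguments are hypothetical and do not appear in the
disclosure, which describes hardware rather than software.

```python
# Illustrative sketch only; classify_fetch is a hypothetical name,
# not part of the original disclosure.

def classify_fetch(instr0_is_branch: bool, instr1_is_branch: bool) -> str:
    """Map the branch/non-branch status of the two fetched
    instructions (instr0, instr1) to the case label in the table."""
    cases = {
        (False, False): "case1",  # notbr, notbr
        (False, True):  "case2",  # notbr, br
        (True,  True):  "case3",  # br,    br
        (True,  False): "case4",  # br,    notbr
    }
    return cases[(instr0_is_branch, instr1_is_branch)]
```

For example, a fetch whose second instruction is the only branch
falls under case2: classify_fetch(False, True).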

      The instruction buffer maintains the instructions fetched from
the cache in program order.  For example, if the instruction buffer
already holds two instructions and two more are fetched from the
cache, instr0 is placed in buffer b2 and instr1 in buffer b3, as
shown in the Figure.
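
      The in-order placement described above might be modelled as
follows.  This is a minimal sketch under stated assumptions: the
class name InstructionBuffer is hypothetical, the buffers are assumed
to be named b0 through b3 and filled from the lowest free position
(the text names only b2 and b3), and a fetch that does not fit is
simply rejected, which the disclosure does not specify.

```python
# Minimal sketch; names and the full-buffer behavior are assumptions,
# not taken from the original disclosure.

class InstructionBuffer:
    def __init__(self, size: int = 4):
        self.slots = [None] * size  # assumed to correspond to b0..b3
        self.count = 0              # number of occupied positions

    def fetch(self, instr0, instr1):
        """Place a fetched pair in the next two free positions,
        preserving program order, and return the buffer names used."""
        if self.count + 2 > len(self.slots):
            # Assumption: no room for a full pair, so reject the fetch.
            raise OverflowError("instruction buffer has no room for two")
        first = self.count
        self.slots[first] = instr0
        self.slots[first + 1] = instr1
        self.count += 2
        return (f"b{first}", f"b{first + 1}")
```

With two instructions already buffered, the next call to fetch
returns ("b2", "b3"), matching the example in the text.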

      Branches are folded one cycle after they are fetched from the
instruction cache.  The folding deallocates the buffer positions
occupied by the branch and all...