Browse Prior Art Database

Compiler-Driven Hybrid Dynamic Branch Predictor

IP.com Disclosure Number: IPCOM000103847D
Original Publication Date: 1993-Feb-01
Included in the Prior Art Database: 2005-Mar-18
Document File: 4 page(s) / 122K

Publishing Venue

IBM

Related People

Chan, JC: AUTHOR [+3]

Abstract

The correlation-based branch prediction reported recently [1] provides a very attractive technique for designing an efficient branch predictor. Its accuracy is generally higher than that of the N-bit counter branch prediction [2] Figs. 1 and 2 show the implementation of the two schemes, respectively. The correlation scheme gives better prediction by taking into consideration the results of the correlated branches. However, its accuracy may be degraded by address conflicts introduced when a small table is used. It was found that not all the branches are correlated with some other branches and that the opcodes and the types (the encodes for the "BO-field" defined in the RIOS Instruction Set Architecture) of the correlated branches does not constitute the entire set of the architected branch opcodes and types.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 50% of the total text.

Compiler-Driven Hybrid Dynamic Branch Predictor

      The correlation-based branch prediction reported recently [1]
provides a very attractive technique for designing an efficient
branch predictor.  Its accuracy is generally higher than that of the
N-bit counter branch prediction [2]  Figs. 1 and 2 show the
implementation of the two schemes, respectively.  The correlation
scheme gives better prediction by taking into consideration the
results of the correlated branches.  However, its accuracy may be
degraded by address conflicts introduced when a small table is used.
It was found that not all the branches are correlated with some other
branches and that the opcodes and the types (the encodes for the
"BO-field" defined in the RIOS Instruction Set Architecture) of the
correlated branches does not constitute the entire set of the
architected branch opcodes and types.

      To reduce the number of unnecessary address conflicts, proposed
is a hybrid technique which combines both the correlation and the
counter branch predictions.  Generally, the new technique requires
the assis tance of the compiler during the compile time.  The
technique is de scribed below:

1.  Two tables are required.  As shown in Fig. 3, Table I implements
    the K-bit counter scheme, whereas Table II implements the (M, N)
    correlation scheme (refer to [1]  for the notation and the
    definition).  Note that an M-bit shift register is required for
    the correlation scheme.  This shift register keeps track the
    outcomes of the most recent M branches.

2.  A one-bit field, called the H-field (or H-bit), from each
    conditional branch instruction is required.  This bit is used to
    instruct the hardware to perform the preferred branch prediction:
    either the counter branch prediction or the correlation branch
    prediction.  The compiler is responsible for specifying the value

    for the H-field at the compile time (this will be discussed
    later).

3.  Both Tables I and II are accessed simultaneously using the same
    branch address under consideration.  Note that the access to
    Table II also requires the use of the shift register.

4.  The predicted outcomes from Tables I and II are selected based on
    the value specified in the H-field of the branch instruction in
    problem.

5.  The selected prediction serves as the final prediction for the
    branch.  The unselected prediction will be discarded.  Note that
    ONLY the selected predictor (either the counter-based or the
    correlation-based) will update its state after the branch is
    resolved.  The state of the unselected predictor remains
    unchanged.

6.  In either case, the shift register is updated accordingly.

There are three ways of specifying the value for the H-field:

1.  Per Address Based - Using Profile Information

    This method requires the use of profiles of previous program
    exec...