Browse Prior Art Database

Branch Buffering

IP.com Disclosure Number: IPCOM000104585D
Original Publication Date: 1993-May-01
Included in the Prior Art Database: 2005-Mar-19
Document File: 4 page(s) / 115K

Publishing Venue

IBM

Related People

Saintonge, DA: AUTHOR

Abstract

Branch History Tables (BHTs) are large arrays which hold information about recently encountered branches. This information is used to predict if the branch will be taken and where it will go. Since the BHT is quite large it is located away from the instruction fetching and prefetching logic which uses it. This distance causes a delay between instruction fetching sending the BHT a search address and the BHT responding. This delay can significantly impact the benefit provided by the BHT. Depending upon the machine in question, this delay has different effects. In the case where prefetching is delayed until a response is received from the BHT you will slow down the instruction fetching/decoding process.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 48% of the total text.

Branch Buffering

      Branch History Tables (BHTs) are large arrays which hold
information about recently encountered branches.  This information is
used to predict if the branch will be taken and where it will go.
Since the BHT is quite large it is located away from the instruction
fetching and prefetching logic which uses it.  This distance causes a
delay between instruction fetching sending the BHT a search address
and the BHT responding.  This delay can significantly impact the
benefit provided by the BHT.  Depending upon the machine in question,
this delay has different effects.  In the case where prefetching is
delayed until a response is received from the BHT you will slow down
the instruction fetching/decoding process.  In the case when the
branch is predicted without waiting for a response the pipeline will
be disrupted when the branch is predicted incorrectly.

      Since branches are usually encountered numerous times and their
behavior is generally repeatable the BHT gives us the benefit of
being able to predicted branches with a high degree of accuracy.
This predictability is especially true of loop controlling branches.
Unfortunately, on present processors, it has been seen that BHT isn't
always "quick enough."  The BHT can hold a considerable number of
branches but we pay for its size with access time.

      The layers of cache allow the most recently used data to be
kept closest to the processor.  This has proven to be quite
beneficial to system performance.  Proposed is a branch buffer.  This
buffer would have N entries.  The search of all N entries would be
done simultaneously.  It would be located close to the prefetching
logic.  This buffer would operate concurrently with the BHT.  When
the search address is sent to the BHT it would also be used to search
the branch buffer.  This is done because if the branch is not found
in the buffer it is not desired to even further slow the response
from the BHT.  If the branch information is found in the buffer it
would eliminate the delay of sending the search address to the BHT
and the return of the branch target information.

      When updating the branch information, the write should be sent
to the BHT and the branch buffer simultaneously, rather than wait
until an entry "ages" out of the buffer to install it into the BHT.
It is suggested that the LRU (least recently used) entry's location
in the branch buffer always be known, this would allow a new entry to
be installed quicker.  If the entry does not have to be "aged out" to
the BHT then this write can occur immediately.  This quick write is
important in tight branching situations.  Numerous cases have been
seen where a branch was not predicted correctly because of the delay
in installing it into the BHT.

      Depending upon the size of the implemented buffer the processor
could keep all or only a subset of the encountered branches in this
branch buffer.  For example, this would all...