Browse Prior Art Database

Method for a recovery and synchronization mechanism for loop prediction using a branch count table

IP.com Disclosure Number: IPCOM000127435D
Publication Date: 2005-Aug-30
Document File: 7 page(s) / 78K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a recovery and synchronization mechanism for loop prediction using a branch count table (BCT). Benefits include improved functionality and improved performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 21% of the total text.

Method for a recovery and synchronization mechanism for loop prediction using a branch count table

Disclosed is a method for a recovery and synchronization mechanism for loop prediction using a branch count table (BCT). Benefits include improved functionality and improved performance.

Background

              Most branch predictors update the prediction state based on the resolved direction of previous branches. A pipelined processor may mispredict the same branch several times before updating for the first occurrence. Only the first misprediction has a measurable performance impact because mispredictions in the mispredicted path (bogus instructions) are thrown away. The predictor should ideally be updated with all branches up to and including the mispredicted branch. The prediction state should be restored to remove the effects of any bogus instructions, and the instruction flow should be redirected down the correct path.

              In a deeply-pipelined out-of-order processor, the predictor update can occur many cycles after the instruction flow has been redirected. A branch may mispredict a second time because it did not receive the update for the first misprediction. For a bimodal predictor, this late update is likely to result in an extra misprediction or two before it enters a strong state. The global predictor can be similarly affected before reaching a strong state. If the global predictor uses bimodal mispredictions in its allocation policy, the second misprediction can also result in unnecessary global allocations. This effect can be minimized with a large global array or a synchronization method.

              For predictors that rely on machine state, recovery following a misprediction is very important. The prediction state is essentially global and can be represented using a feasible number of bits. Restoring the predictor state is expensive. The global branch history (global stew) may contain any number of bits (based on performance/cost), and must be saved for each branch that may eventually mispredict, requiring a large structure. In the same way, the restore state buffer (RSB) requires few bits to restore its state but is costly when multiplied by each branch.

              Restoration for several other predictor types using this method becomes completely impractical. A local predictor’s state is a pattern for each branch in the local array. Saving the entire local prediction state is not only unrealistic in terms of space but would waste many cycles to look up and replace each erroneous entry. Similarly, repairing a loop predictor would necessitate restoring the iteration count for each branch that may have been incorrectly updated.

              Some predictors require knowledge known only at retirement, such as mispredictions or actual branch direction. For predictors that rely on back-end feedback to originate their speculative state, synchronization between prediction and update is essential. New predictors, such as the stew-base...