
Method and Apparatus for Predicted Prefetch Suppression

IP.com Disclosure Number: IPCOM000199675D
Publication Date: 2010-Sep-14
Document File: 2 page(s) / 41K

Publishing Venue

The IP.com Prior Art Database

Abstract

It has been shown that about one in every five instructions is a branch. As such, it is common for a taken branch to redirect to its target stream and then, within a few instructions along that stream, for another taken branch to redirect to yet another stream. In many such cases, all instructions along the first branch's target stream (including the later taken branch) reside on the same cache line. In a system with asynchronous branch prediction, the fetch logic can be fetching along a sequential stream beyond the point of a potentially predicted taken branch. This extraneous fetching leads to several inefficiencies. First, if these extra fetches are made to a different cache line than that of the taken branch, and that new line is not in the cache, the cache may be forced to allocate resources to bring the new line in. Not only does this tie up resources and waste power, but bringing in the unnecessary line can pollute the cache by evicting a different line that is still needed. Second, these extra fetches consume fetch resources that could be allocated to other, more necessary streams. To address these inefficiencies, it would be beneficial to know whether a stream-changing branch lies within the same cache line as the currently executing instruction stream. If it does, fetching to the next sequential cache line can be avoided. One scheme used to deal with this problem involves checking in real time that a newly predicted branch is within the same cache line as the currently executing stream. The problem with this method is that, in an asynchronous branch prediction system, fetching may actually run ahead of branch prediction. As such, by the time the branch is predicted and the detection completes, fetches may already have been made to the sequential cache line.
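
The core test underlying the scheme, whether two instruction addresses fall on the same cache line, can be sketched as follows. The 256-byte line size and the function name are assumptions for illustration only, not part of the disclosure:

```python
# Hypothetical sketch: two addresses share a cache line exactly when they
# have the same line index. Line size (256 bytes here) is an assumption.
CACHE_LINE_BYTES = 256

def same_cache_line(addr_a: int, addr_b: int,
                    line_bytes: int = CACHE_LINE_BYTES) -> bool:
    """True if both addresses index the same cache line."""
    return addr_a // line_bytes == addr_b // line_bytes

# A branch target and a second taken branch a few instructions later:
print(same_cache_line(0x1000, 0x10A0))  # same 256-byte line -> True
print(same_cache_line(0x1000, 0x1100))  # next sequential line -> False
```

When the second branch is on the same line, any fetch to the sequential line `0x1100` would have been wasted.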




This type of system can be broken down into three subsystems: detection, prediction, and fetch suppression and recovery.

    Detection - When a taken branch (branch A) is predicted, the IA-reg associated with the target instruction is saved off. When the next taken branch (branch B) is seen, the IA-reg associated with branch B is compared against the saved IA-reg of branch A's target instruction. If the IA-regs match, branch B resides within the same cache line as branch A's target instruction; if they do not match, branch B resides on a different cache line. If branch A has not yet completed, its "anti prefetch" status is updated for future predictions. If the branch has already completed with no update, the "anti prefetch" status of that branch remains unchanged.
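
The detection bookkeeping can be modeled roughly as below. Here an IA-reg is simplified to the cache-line index of an address; the class and method names, and the 256-byte line size, are illustrative assumptions:

```python
# Hedged sketch of the detection step: save branch A's target IA-reg,
# then compare branch B's IA-reg against it. Names are illustrative.
LINE_BYTES = 256

def line_index(addr: int) -> int:
    return addr // LINE_BYTES

class DetectState:
    def __init__(self):
        self.saved_target_line = None     # IA-reg of branch A's target
        self.pending_anti_prefetch = None # update pending for branch A

    def on_taken_branch(self, branch_addr: int, target_addr: int):
        if self.saved_target_line is not None:
            # Branch B seen: same IA-reg means branch B is on the same
            # cache line as branch A's target instruction.
            self.pending_anti_prefetch = (
                line_index(branch_addr) == self.saved_target_line)
        # This branch now plays the role of "branch A" for the next one.
        self.saved_target_line = line_index(target_addr)
        return self.pending_anti_prefetch
```

For example, a branch at `0x3040` following a target of `0x3000` sits on the same 256-byte line, so the pending "anti prefetch" status would be set.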

    Prediction - When a taken branch completes with a pending update to its "anti prefetch" status (either set or reset), the BTB entry of the branch should be updated to reflect the change. When that branch is predicted again, the "anti prefetch" information should be transmitted to the fetch logic.
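
A minimal model of the BTB update and prediction flow, under the assumption that each BTB entry simply carries one extra "anti prefetch" bit (field and method names are hypothetical):

```python
# Illustrative BTB model: the anti_prefetch bit is written back when a
# branch completes with a pending update (set or reset), and is
# transmitted alongside the target on the next prediction.
class BTBEntry:
    def __init__(self, target: int):
        self.target = target
        self.anti_prefetch = False

class BTB:
    def __init__(self):
        self.entries = {}

    def complete(self, branch_addr: int, target: int, pending):
        """Branch completion; `pending` is True/False/None (no update)."""
        entry = self.entries.setdefault(branch_addr, BTBEntry(target))
        if pending is not None:
            entry.anti_prefetch = pending  # set or reset the marking

    def predict(self, branch_addr: int):
        """Return (target, anti_prefetch) to hand to the fetch logic."""
        entry = self.entries.get(branch_addr)
        if entry is None:
            return None
        return entry.target, entry.anti_prefetch
```

A completion with no pending update (`None`) leaves the stored marking unchanged, matching the detection rule above.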

    Fetch Suppression and Recovery - When a taken branch is predicted with its "anti prefetch" marking set, fetching to the branch target's sequential cache line should be suppressed. This suppression is performed within the Super Basic Block Buffer, or SBBB. This special buffer (a design may contain multiple SBBBs) not only stages the instruction text received from the cache, but also steps through the data, picking out instruction length codes that it uses to parse the instruction text into individual opcodes. Within each SBBB there are two local IA-regs, which correspond either to one stream with two sequential cache lines or to two separate streams each with one cache line. When a taken branch is predicted, the target stream is init...
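
The suppression decision itself reduces to choosing which cache lines to request for a predicted target. The sketch below is a hedged illustration under an assumed 256-byte line size; the SBBB staging and parsing details are deliberately simplified away:

```python
# Sketch of fetch suppression: given a predicted branch target and its
# anti-prefetch hint, decide which cache lines the fetch logic requests.
LINE_BYTES = 256

def lines_to_fetch(target_addr: int, anti_prefetch: bool):
    first = (target_addr // LINE_BYTES) * LINE_BYTES
    if anti_prefetch:
        # The next taken branch lies on this same line, so the
        # sequential line is never needed; suppress that fetch.
        return [first]
    return [first, first + LINE_BYTES]

print([hex(a) for a in lines_to_fetch(0x3040, True)])   # ['0x3000']
print([hex(a) for a in lines_to_fetch(0x3040, False)])  # ['0x3000', '0x3100']
```

With the marking set, only the target's own line is requested, freeing fetch bandwidth for other streams and avoiding a potentially polluting cache allocation.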