Method for explicit trace cache instructions
Disclosure Number: IPCOM000008934D
Publication Date: 2002-Jul-24
Document File: 3 page(s) / 61K

Publishing Venue

The Prior Art Database


Disclosed is a method for explicit trace cache instructions. Benefits include improved performance and improved power usage.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 50% of the total text.


As processors continue to gain higher clock speeds and performance potential, the ability of memory systems to keep up is increasingly taxed. Techniques such as adding more cache and prefetching can help, but cache is costly and prefetching is limited by memory bandwidth. Prefetching also frequently consumes power and resources on fetches that turn out not to be needed. Explicit control of the trace cache can take advantage of increasing cache sizes by enabling them to be used more effectively.

The first two decode stages within the instruction pipeline can only decode one instruction per clock cycle. Decoding is often the bottleneck within this pipeline.

Conventional cache algorithms are limited because they are not optimized for specific applications. Instruction cache hit rates can be improved by giving the application developer more explicit control over the trace cache.

The disclosed method addresses the greatest challenge of processor design, increasing performance while decreasing power draw, by improving both at once. As the hit rate improves, so does performance. At the same time, reducing the amount of time that the decoder must be active reduces power usage.


The disclosed method includes explicit instructions for the trace cache. The method is based on a single decoder feeding the trace cache.

The trace cache holds micro-operations (µops) that have been previously decoded from x86 software instructions. Like any cache, the goal is to achieve a high hit rate. In the case of the trace cache, a higher hit rate means less time spent decoding new instructions and lower power usage.
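
The link between hit rate and decoder activity can be illustrated with a small arithmetic model. This is only a sketch under stated assumptions, not part of the disclosure: it assumes a trace cache hit needs no decoding and a miss costs exactly one decoder cycle per instruction (in line with a single decoder handling one instruction per clock). The function name and parameters are hypothetical.

```python
def decoder_cycles(instructions_fetched: int, hit_rate: float) -> float:
    """Estimate decoder-active cycles under a simple model.

    Assumptions (not from the disclosure): a hit needs no decoding,
    and a miss costs exactly one decoder cycle per instruction.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return instructions_fetched * (1.0 - hit_rate)


# In this model, raising the hit rate from 90% to 95% halves decoder activity.
baseline = decoder_cycles(1_000_000, 0.90)
improved = decoder_cycles(1_000_000, 0.95)
```

Under these assumptions, the decoder is busy only for the miss fraction of fetched instructions, so small gains at high hit rates remove a large share of the remaining decode work.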

Giving independent software vendors (ISVs) explicit instructions to decode and store instructions in the trace cache increases the hit rate by ensuring that frequently executed traces are already decoded. The instructions that are decoded in this area are designated by software as crucial to performance.
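
The effect of keeping software-designated traces resident can be sketched with a toy simulation. Everything in it is an assumption made for illustration: the disclosure's abbreviated text does not specify a replacement policy, so LRU is used here as a stand-in, entries are whole traces, and the `simulate` function and its `pinned` parameter are hypothetical names for traces that were decoded ahead of time by explicit request and are never evicted.

```python
from collections import OrderedDict


def simulate(accesses, capacity, pinned=()):
    """Return the hit rate of a toy trace cache over a sequence of trace IDs.

    Assumptions (not from the disclosure): replacement is LRU, pinned
    traces are decoded up front, are never evicted, and count against
    the total capacity.
    """
    pinned = set(pinned)
    if len(pinned) > capacity:
        raise ValueError("pinned traces exceed cache capacity")
    lru = OrderedDict()                      # dynamically managed entries
    dynamic_capacity = capacity - len(pinned)
    hits = 0
    for trace in accesses:
        if trace in pinned:
            hits += 1                        # already decoded by explicit request
        elif trace in lru:
            hits += 1
            lru.move_to_end(trace)           # mark as most recently used
        elif dynamic_capacity > 0:
            if len(lru) >= dynamic_capacity:
                lru.popitem(last=False)      # evict the least recently used trace
            lru[trace] = True
    return hits / len(accesses) if accesses else 0.0


# A hot trace "H" interleaved with cold traces that never repeat and
# would otherwise keep evicting it from a two-entry cache.
accesses = []
for i in range(100):
    accesses += ["H", f"c{2 * i}", f"c{2 * i + 1}"]

plain = simulate(accesses, capacity=2)
explicit = simulate(accesses, capacity=2, pinned=["H"])
```

In this contrived access pattern the hot trace is evicted before every reuse under plain LRU, while the pinned variant hits on every execution of it. Real workloads would show a smaller gap.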

The processor vendor works with the ISVs to identify traces that are potential bottlenecks within code. Tools support identifying and implementing these instructions in the ISV’s code using profile-guided optimization with the compiler.
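
One way a profile-guided tool might choose which traces to mark is sketched below. This is a hypothetical illustration, not the tooling described in the disclosure: the profile format, the per-trace µop sizes, the µop budget, and the greedy hottest-first policy are all assumptions.

```python
from collections import Counter


def select_hot_traces(profile, uop_sizes, budget_uops):
    """Greedily pick the most frequently executed traces that fit a µop budget.

    `profile` is an iterable of executed trace IDs (hypothetical profiler
    output); `uop_sizes` maps each trace ID to its decoded size in µops.
    """
    counts = Counter(profile)
    chosen, used = [], 0
    for trace, _ in counts.most_common():    # hottest traces first
        size = uop_sizes[trace]
        if used + size <= budget_uops:
            chosen.append(trace)
            used += size
    return chosen


# Hypothetical profile: trace IDs repeated once per execution.
profile = ["loop_a"] * 50 + ["loop_b"] * 30 + ["init"] * 1 + ["loop_c"] * 10
sizes = {"loop_a": 12, "loop_b": 20, "loop_c": 6, "init": 4}
hot = select_hot_traces(profile, sizes, budget_uops=20)
```

A greedy pass is simple but not optimal. A real tool could weigh execution count against decoded size, or treat the choice as a knapsack problem.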

Four new instructions are proposed for the processor’s instruction set to improve trace cache usage.

Two instructions are added to the processor’s instruction set to bound the specified µops. The processor places the specified decoded µops into the semi-stati...