Method for a loop stream detector and playback device

IP.com Disclosure Number: IPCOM000016695D
Publication Date: 2003-Jul-09
Document File: 5 page(s) / 77K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a loop stream detector and playback device. Benefits include improved performance and improved power performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 28% of the total text.

Background

              In a typical modern pipelined microprocessor, the unit that fetches instructions sits higher (earlier) in the pipeline than the units that decode and execute them. As a result, the program path must be predicted so that the instructions fetched are the ones that will actually be decoded and executed. The information for such a predictor is supplied by the execution units, which identify the branches in the program that change the program flow. The devices that control the prediction of fetches are referred to as branch predictors, even though the presence of branches is not yet known at this point in the pipe.

              Due to pipelining, higher clock speeds, and more complicated branch prediction algorithms, an accurate prediction takes multiple cycles. Modern processors counter this timing problem by making fast but less accurate predictions, which are backed up by an accurate multicycle predictor.

              A very simple example is a pipeline in which sequential fetches can follow each other every cycle. Predicting the next sequential fetch is the default and is fast. Any taken branch requires the slower prediction, so a taken branch (even one whose target is the next sequential fetch) incurs a single-cycle bubble.

              For programs with very tight loops (a small number of instructions in the loop body), the bubbles incurred from the taken branches can drastically decrease the bandwidth delivered to the decoders and execution units, resulting in a significant performance loss.
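The cost of these bubbles can be estimated with simple arithmetic. The following is a minimal sketch, assuming the simple pipeline described above (one fetch per cycle, one dead cycle per taken branch) and a loop whose only taken branch is the one that closes each iteration. The function name and its parameters are illustrative and do not come from the disclosure.

```python
def fetch_utilization(fetches_per_iteration: int, bubble_cycles: int = 1) -> float:
    """Fraction of cycles that deliver a fetch, assuming each loop iteration
    ends in one taken branch that costs `bubble_cycles` dead cycles."""
    if fetches_per_iteration < 1:
        raise ValueError("a loop body needs at least one fetch")
    useful_cycles = fetches_per_iteration
    total_cycles = fetches_per_iteration + bubble_cycles
    return useful_cycles / total_cycles


# Tighter loops lose a larger share of the peak fetch bandwidth.
for n in (1, 2, 4, 16):
    print(f"{n:2d} fetches per iteration: {fetch_utilization(n):.0%} of peak")
```

Under these assumptions, a loop that fits in a single fetch delivers only half of the peak fetch bandwidth, while a loop of 16 fetches loses about 6%.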

General description

              The disclosed method uses a device that performs loop stream detection and playback. The device operates within the branch predictor. The disclosed method overcomes the timing and performance problems of the conventional method.

              The key elements of the disclosed method include:

•             Loop stream detector

•             Playback device

Advantages

              The disclosed method provides advantages, including:

•             Improved performance due to eliminating the branch prediction unit (BPU) taken-branch dead cycle during playback

•             Improved performance due to removing the special slow predecode cases because the predecode information is saved

•             Improved performance due to using the interface between the ILDQ and decoders effectively

•             Improved power performance due to shutting off the ICache, BPU, and predecode circuitry

Detailed description

              The loop stream detector (LSD) and playback device holds a fixed number of entries, each containing information that corresponds to the address being fetched and whatever look-up state the predictor uses (such as a global history). The purpose of the device is to detect a sequence of fetches that leads back to a particular state. The sequence of fetches produces an infinite...
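The detection idea stated above (a fixed number of entries keyed by fetch address and predictor look-up state, watched for a return to an earlier state) can be sketched in software. This is a hypothetical illustration only: the available text is abbreviated, so the class name, the entry count, the most-recent-match policy, and the bounded `playback` helper are all assumptions rather than the disclosed hardware design.

```python
from collections import deque
from typing import Deque, Hashable, Iterator, List, Optional, Tuple

# (fetch address, predictor look-up state such as a global history)
FetchState = Tuple[int, Hashable]


class LoopStreamDetector:
    """Hypothetical detector: remembers the last `entries` fetch states and
    reports a loop when a new fetch returns to a remembered state."""

    def __init__(self, entries: int = 8) -> None:
        # A bounded deque models the fixed number of entries.
        self.window: Deque[FetchState] = deque(maxlen=entries)

    def observe(self, address: int,
                lookup_state: Hashable = None) -> Optional[List[FetchState]]:
        state = (address, lookup_state)
        history = list(self.window)
        self.window.append(state)
        # Search from the newest entry back so the shortest loop is reported.
        for start in range(len(history) - 1, -1, -1):
            if history[start] == state:
                return history[start:]  # one trip around the loop
        return None


def playback(loop: List[FetchState], iterations: int) -> Iterator[FetchState]:
    """Replay a detected loop a bounded number of times (assumed behavior)."""
    for _ in range(iterations):
        yield from loop


detector = LoopStreamDetector(entries=4)
for addr in (0x100, 0x110, 0x120, 0x100):
    loop = detector.observe(addr)
print(loop)  # the three fetch states that make up one loop iteration
```

In this sketch, a loop longer than the number of entries is never detected, because the state it returns to has already been evicted from the window.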