
Speculative Restore of History Buffer in a Microprocessor
Disclosure Number: IPCOM000250357D
Publication Date: 2017-Jul-05
Document File: 5 page(s) / 1M

Publishing Venue

The Prior Art Database


Disclosed is a mechanism to speculatively process a ‘flush’ to improve recovery performance in a microprocessor. The mechanism speculatively reads out entries in the History Buffer (HB) to pre-load the HB recovery pipeline following the dispatch of a weakly predicted branch instruction. If the weakly predicted branch is mispredicted, then the HB restore process can be significantly accelerated because the HB is speculatively primed for restoring.

This is the abbreviated version, containing approximately 31% of the total text.

Speculative Restore of History Buffer in a Microprocessor

In conventional processors, a History Buffer (HB) is a centralized component of the execution unit, which contains both speculative and architected register states. This component backs up the register file data when a new instruction is dispatched, in case the new instruction is flushed and the old data needs to be recovered.

Figure 1 illustrates this dispatch/eviction process for a processor whose map table supports four registers. A history buffer with eight entries allows the processor to support speculative execution, in which a branch predictor predicts the direction of a branch (taken or not taken) and the processor speculatively executes instructions fetched from the predicted path. In this figure, a newly dispatched instruction with instruction tag (itag) 0x0B produces a value that is stored in register 2 (reg 2) after the instruction is issued and executed. The prior value of register 2, produced by itag 0x0A, is evicted from the register file and must be stored in the history buffer in case the new instruction is flushed. In this example, the history buffer stores all the information associated with register 2 and keeps it until instruction 0x0B completes. Once the evictor (0x0B) has completed, there is no need to keep the data produced by itag 0x0A, because 0x0B can no longer be flushed.

Figure 1: Centralized History Buffer storing evicted Register-File data
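
The dispatch, completion, and flush behavior described above can be sketched in a few lines of Python. This is only an illustrative software model, not the disclosed hardware: the class and method names (HistoryBuffer, dispatch, write_result, complete, flush) are hypothetical, itags are assumed to increase monotonically without wrapping, and a flush is assumed to remove every instruction whose itag is greater than or equal to the flush itag.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class RegEntry:
    itag: int                    # instruction that produces this value
    value: Optional[int] = None  # None until the producer has executed


@dataclass
class HBEntry:
    reg: int           # register whose old contents are backed up
    old: RegEntry      # evicted register-file contents
    evictor_itag: int  # instruction whose dispatch caused the eviction


class HistoryBuffer:
    """Toy model: four registers and an eight-entry HB, as in Figure 1."""

    def __init__(self, num_regs: int = 4, num_entries: int = 8) -> None:
        # Assumption: registers start with an architected value of 0.
        self.regs: Dict[int, RegEntry] = {
            r: RegEntry(itag=-1, value=0) for r in range(num_regs)
        }
        self.entries: List[HBEntry] = []        # kept in dispatch order
        self.producers: Dict[int, RegEntry] = {}
        self.capacity = num_entries

    def dispatch(self, itag: int, dest_reg: int) -> None:
        """Evict the current contents of dest_reg into the HB."""
        if len(self.entries) >= self.capacity:
            raise RuntimeError("history buffer full; dispatch must stall")
        self.entries.append(HBEntry(dest_reg, self.regs[dest_reg], itag))
        new_entry = RegEntry(itag=itag)
        self.regs[dest_reg] = new_entry
        self.producers[itag] = new_entry

    def write_result(self, itag: int, value: int) -> None:
        """Deliver a result, whether its target is in the RF or the HB."""
        entry = self.producers.get(itag)
        if entry is not None:  # ignore results of flushed instructions
            entry.value = value

    def complete(self, itag: int) -> None:
        """The evictor can no longer be flushed; drop its backup."""
        self.entries = [e for e in self.entries if e.evictor_itag != itag]
        self.producers.pop(itag, None)

    def flush(self, flush_itag: int) -> None:
        """Restore registers evicted by flushed instructions."""
        kept: List[HBEntry] = []
        # Walk youngest to oldest so the oldest flushed evictor's backup
        # is the value that finally lands in the register file.
        for e in reversed(self.entries):
            if e.evictor_itag >= flush_itag:
                self.regs[e.reg] = e.old
                self.producers.pop(e.evictor_itag, None)
            else:
                kept.append(e)
        kept.reverse()
        self.entries = kept
```

In this model, dispatching itag 0x0B to register 2 moves the value produced by 0x0A into the HB; calling flush(0x0B) puts it back, while calling complete(0x0B) discards the backup instead.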

With a multi-slice processor architecture, where execution units are distributed across many slices (or lanes), a centralized HB structure is no longer feasible. The execution slices would need to be connected to communicate instruction results and process flush-recovery of registers. Such a centralized design requires a large number of ports and entries for the register file, the history buffer, or both. This can require an excessive number of wires routed to the distributed execution slices, which would consume a significant amount of wiring resources on the processor die or add significant latency to register read and write operations.

An HB distributed across the execution slices reduces cross-slice communication, but requires special consideration to recover the general-purpose registers (GPRs) to the proper state. This consideration and arbitration may add latency to the register recovery mechanism. GPR data to be recovered from the HB is marked “Recovery Pending” (RP) and broadcast to the GPRs located in the slices via the result buses of the execution unit. This process is shown below in Figure 2. Within each slice, recovery data is read from the HB, sent to the Reservation Station, and bypassed to the Execution Unit via one of the Operand Registers. The Execution Unit then places the recovery data from the Operand Register on its dedicated slice result bus. The data then travels from the result bus to every slice, where the recovery data is written into all the GPR copies across the distributed mapper. All recovery-pending HB entries...