Method and apparatus for branch recovery in out of order dispatch and out of order retire instruction stream environment

IP.com Disclosure Number: IPCOM000125694D
Original Publication Date: 2005-Jun-13
Included in the Prior Art Database: 2005-Jun-13

Publishing Venue

IBM

Abstract

The key idea of the invention is that, by handling branch processing together with a multiple instruction tag system such as TID and GID, branches can be processed efficiently by a two-layer branch processor: instructions are dispatched and executed speculatively, but the current state is saved before the speculative dispatch, so that mis-predicted instructions can be flushed and the original state restored.
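The save, speculate, and flush cycle described in the abstract can be illustrated with a minimal Python sketch. It is not the disclosed design: the abbreviated text does not define TID or GID, so the sketch assumes that TID is a per-instruction tag and GID is a group tag shared by all instructions dispatched under one branch prediction. The names Core, predict_branch, dispatch and resolve_branch are hypothetical, and speculative results are written straight into the register map instead of into a rename buffer to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Instr:
    tid: int    # assumed: unique per-instruction tag
    gid: int    # assumed: tag of the speculation group the instruction belongs to
    dest: str   # destination register name
    value: int  # result written to the destination


@dataclass
class Core:
    regs: dict = field(default_factory=dict)         # current register state
    checkpoints: dict = field(default_factory=dict)  # gid -> saved register state
    in_flight: list = field(default_factory=list)    # dispatched, not yet retired
    next_tid: int = 0
    next_gid: int = 0                                # gid 0 is non-speculative

    def predict_branch(self) -> int:
        """Save the current state, then open a new speculative group."""
        self.next_gid += 1
        self.checkpoints[self.next_gid] = dict(self.regs)
        return self.next_gid

    def dispatch(self, gid: int, dest: str, value: int) -> Instr:
        """Dispatch one instruction under the given group tag."""
        instr = Instr(self.next_tid, gid, dest, value)
        self.next_tid += 1
        self.in_flight.append(instr)
        self.regs[dest] = value  # simplification: speculative write in place
        return instr

    def resolve_branch(self, gid: int, predicted_correctly: bool) -> None:
        """Drop the checkpoint on a correct prediction, else flush and restore."""
        if predicted_correctly:
            del self.checkpoints[gid]
            return
        # Flush this group and every younger group, then restore the saved state.
        self.in_flight = [i for i in self.in_flight if i.gid < gid]
        self.regs = self.checkpoints[gid]
        for g in [g for g in self.checkpoints if g >= gid]:
            del self.checkpoints[g]
```

In this sketch a mis-prediction discards every instruction whose group tag is equal to or newer than the failed group, which is why a single comparison on GID is enough to select the instructions to flush.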

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 15% of the total text.

In modern superscalar RISC CPU design, the biggest problem is recovering from a mis-predicted branch direction, since speculative dispatch may turn out to be wrong:

[Figure: Block Diagram of the VISA processor. Labeled blocks: Processor Core; Bus Interface Unit with wide data bus; Instruction fetcher; Rename buffer; Branch Processing Unit; Cond Reg; MSR; Mux/Distributor; Instruction Reservation Station; Timer; Clock Multiplier/PLL; Register File with multiple input ports and multiple output ports; Instruction Dispatch Unit with speculative out-of-order dispatch and IID and GID generation; Load Queue; Load/Store Unit with load buffer and store buffer; Internal Bus; Execution Unit; Effective Address Translation Unit for virtual address management; TLB; I-Cache and D-Cache management tags; Reorder buffer; Scan chain and self-test vector generation co-processor (JTAG compliance); Level 2 on-chip data and instruction cache; Completion Unit (entry reorder buffer, retirement).]

Diagram 1: Typical super scalar CPU architecture

When an instruction stream is dispatched out of order and retired out of order, there must be a mechanism that restores this double layer of 'disorder' to the original order, so that conflicts over resources and over register and memory targets are avoided and the results are correct.

A superscalar processor is designed to maximize IPC (instructions per cycle) through coordinated design of the compiler and the processor core. Inside the integer and floating-point execution units and the branch resolution unit, reservation stations and rename buffers are used extensively to assist recovery from wrong predictions during speculative execution. The completion buffer is designed to further reorder the out-of-order dispatched instruction sequence so that instructions retire in order, according to their original logical sequence.
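The role of the completion buffer can be sketched as follows, in Python, under stated assumptions: entries are allocated in program order, results may arrive in any order, and an entry leaves the buffer only when every older entry has already left. The class ReorderBuffer and its methods allocate, complete and retire are hypothetical names chosen for illustration, not the interface of the disclosed completion unit.

```python
from collections import OrderedDict

_PENDING = object()  # marker for an entry whose result has not arrived yet


class ReorderBuffer:
    """Entries are allocated in program order and retired from the head only."""

    def __init__(self):
        self.entries = OrderedDict()  # tag -> result, or _PENDING

    def allocate(self, tag):
        """Reserve an entry at dispatch time, in original program order."""
        self.entries[tag] = _PENDING

    def complete(self, tag, result):
        """Record a result; execution units may call this in any order."""
        self.entries[tag] = result

    def retire(self):
        """Retire finished entries from the head until a pending one blocks."""
        retired = []
        while self.entries:
            tag, result = next(iter(self.entries.items()))
            if result is _PENDING:
                break
            self.entries.popitem(last=False)
            retired.append((tag, result))
        return retired
```

A younger instruction that finishes early simply waits in the buffer, so the architected state is only ever updated in the original sequence.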

The division into blocks in the implementation is based on separation of functionality. Another consideration is the processor's internal bus boundaries. With these considerations in mind, processor micro architectur...