System and Method for Achieving Out-of-Order Execution of Instructions in Simultaneously-Multithreaded (SMT) Processor Cores
Publication Date: 2010-Sep-20
The IP.com Prior Art Database
A system and method are provided to issue and execute out-of-order instructions in simultaneously-multithreaded (SMT) processor cores. A basic block is split into multiple register threads. After the split, ambiguous memory references of the basic block are retained in program order within a single register thread. Thereafter, a checkpointing mechanism is used to issue and execute out-of-order instructions.
Disclosed is a method and system for issuing and executing out-of-order instructions in simultaneously-multithreaded (SMT) processor cores.
Typically, multi-core processors incorporate SMT cores. Each SMT core issues instructions from multiple threads simultaneously during a clock cycle. Every thread has a dedicated register file in the SMT core. However, most other control and data paths of the SMT core are shared by all the threads. An SMT core simultaneously executes threads that are either independent of each other or in communication with each other via shared memory. The SMT cores used in the disclosed method and system employ single, in-order instruction issue and in-order instruction execution/completion. This eliminates the complexities introduced by register renaming.
The method and system involve splitting a basic block into multiple register threads using a compiler based on instruction scheduling. Register threads are threads that work in the same physical register file, unlike traditional SMT threads, which have their own individual register files. In an SMT processor, each individual register file is identified by a thread-ID prefix. The thread-ID prefix is used for indexing the register threads into a large shared physical register file. Two register threads may be given the same thread-ID prefix for indexing into the shared physical register file.
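The prefix-based indexing described above can be illustrated with a minimal sketch. The register counts, the linear layout of the shared file, and the names `physical_index`, `register_thread_prefix`, and `slot` are assumptions made for illustration; the disclosure does not specify them.

```python
# Hypothetical sizes; the disclosure does not give concrete numbers.
REGS_PER_PREFIX = 32   # registers addressable under one thread-ID prefix
NUM_PREFIXES = 4       # distinct thread-ID prefixes in the shared file


def physical_index(thread_prefix: int, reg: int) -> int:
    """Map (thread-ID prefix, register number) to a slot in the shared file."""
    if not 0 <= thread_prefix < NUM_PREFIXES:
        raise ValueError("thread-ID prefix out of range")
    if not 0 <= reg < REGS_PER_PREFIX:
        raise ValueError("register number out of range")
    # Assumed layout: each prefix selects a contiguous bank of registers.
    return thread_prefix * REGS_PER_PREFIX + reg


# Assumed assignment: register threads rt0 and rt1 share prefix 1,
# so they work in the same bank; rt2 uses a different bank.
register_thread_prefix = {"rt0": 1, "rt1": 1, "rt2": 2}


def slot(register_thread: str, reg: int) -> int:
    """Resolve a register of a named register thread to a physical slot."""
    return physical_index(register_thread_prefix[register_thread], reg)
```

Under this assumed layout, `slot("rt0", 5)` and `slot("rt1", 5)` resolve to the same physical slot because both register threads carry the same prefix, while `slot("rt2", 5)` lands in a different bank.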
The splitting of the basic block is enabled by handling inter-register-thread dependencies, including register-based dependencies and memory-based dependencies. Further, even after the splitting, the basic block's ambiguous memory references are retained in program order within a single register thread.
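The constraint on ambiguous memory references can be sketched as follows. This is a deliberately simplified illustration, not the disclosed scheduling algorithm: non-memory instructions are distributed round-robin without regard to register dependencies, and the `Instr` type, the `ambiguous_mem` flag, and `split_block` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    pos: int                      # position in original program order
    text: str                     # assembly-like text, for display only
    ambiguous_mem: bool = False   # True if the memory reference is ambiguous


def split_block(block, num_threads, mem_thread=0):
    """Split a basic block into register threads.

    All ambiguous memory references go to one register thread (mem_thread).
    Because the block is walked in program order and instructions are only
    appended, each register thread preserves the original relative order.
    """
    threads = [[] for _ in range(num_threads)]
    next_thread = 0
    for instr in block:
        if instr.ambiguous_mem:
            threads[mem_thread].append(instr)
        else:
            # Placeholder policy; a real compiler would follow dependencies.
            threads[next_thread].append(instr)
            next_thread = (next_thread + 1) % num_threads
    return threads


block = [
    Instr(0, "load  r1, [r2]", ambiguous_mem=True),
    Instr(1, "add   r3, r1, r4"),
    Instr(2, "store [r5], r3", ambiguous_mem=True),
    Instr(3, "mul   r6, r7, r8"),
    Instr(4, "load  r9, [r10]", ambiguous_mem=True),
]
threads = split_block(block, num_threads=2)
```

In this example, the three ambiguous memory references all end up in register thread 0 in their original order, so no cross-thread memory disambiguation is needed for them.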
Registers used by a basic block may be classified by a compiler into live-on-entry registers, live-on-exit registers, and local registers. Live-on-entry registers are produced outside the basic block and read within the basic block, whereas live-on-exit registers are produced by the basic block and used outside the basic block. The registers that are produced and consumed entirely within a basic block are termed local registers. Local registers do not have consumers outside the basic block. The compiler then assigns each local register instance to an appropriate register of the local register file, which amounts to static renaming of the local register instances. However, the global registers, including live-on-entry registers and live-on-exit registers, are not acted upon.
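The classification above can be sketched for a single basic block. The instruction encoding as `(dest, [sources])` tuples, the externally supplied `live_out` set, and the one-slot-per-instance allocation are assumptions for illustration; the disclosure does not describe the compiler's actual data structures or allocation policy.

```python
def classify(block, live_out):
    """Classify register instances of one basic block.

    block:    list of (dest, [srcs]) in program order; dest may be None.
    live_out: set of register names read after the block (assumed given).
    Returns (live_on_entry, live_on_exit, local_slot), where register
    instances are identified as (register name, defining instruction index).
    """
    live_on_entry = set()
    defined = set()
    last_def = {}
    for i, (dest, srcs) in enumerate(block):
        for s in srcs:
            if s not in defined:
                # Read before any write in the block: produced outside.
                live_on_entry.add(s)
        if dest is not None:
            defined.add(dest)
            last_def[dest] = i

    # Only the final definition of a live-out register is visible outside.
    live_on_exit = {(r, i) for r, i in last_def.items() if r in live_out}

    # Every other definition is produced and consumed within the block.
    local = [
        (dest, i)
        for i, (dest, _) in enumerate(block)
        if dest is not None and (dest, i) not in live_on_exit
    ]
    # Naive static renaming: one local-register-file slot per instance.
    local_slot = {inst: n for n, inst in enumerate(local)}
    return live_on_entry, live_on_exit, local_slot


block = [
    ("r1", ["r2", "r3"]),   # r1 = r2 op r3
    ("r4", ["r1"]),         # r4 = op r1
    ("r1", ["r4", "r5"]),   # r1 = r4 op r5
]
entry, exits, slots = classify(block, live_out={"r1"})
```

Here the first write to `r1` and the write to `r4` are local instances and receive local slots, while the final write to `r1` is live-on-exit and is left alone, as are the live-on-entry registers `r2`, `r3`, and `r5`.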
The classification of the registers refers to each instance of an architectural register or physical register that the basic block's registers would conceptually get renamed to during execution. The basic blocks are typically short and therefore require a set of local registers small enough to fit into the physical register file of a single SMT thread. Howeve...