Conditional Execution - Allocating Instructions to Resource Instances with Less Ports than the Number of Instruction Inputs.
Original Publication Date: 2004-May-25
Included in the Prior Art Database: 2004-May-25
This document presents an efficient method for mapping conditional instructions onto resource instances that have fewer input ports than the number of input operands required by the conditional instructions.
The Reconfigurable Streaming Vector Processor (RSVP) is a vector coprocessor
architecture that accelerates streaming data processing. The architecture
does not support control-intensive applications; it does, however, contain
a very small number of control instructions. One of them is a conditional
instruction that takes three inputs: a, b, and c.
This instruction implements the following C construct:
out = (c) ? a : b;
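Applied lane by lane to vector data, the construct above could be modeled
as in the following minimal sketch. The function name cond_select, the
16-bit lane type, and the rule that any nonzero c counts as true are
assumptions made for illustration; they are not taken from the RSVP
specification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lane-wise model of the conditional instruction:
 * out[i] = c[i] ? a[i] : b[i] for each of the n lanes.
 * Assumption: any nonzero decision value selects a[i]. */
static void cond_select(const int16_t *a, const int16_t *b,
                        const int16_t *c, int16_t *out, size_t n)
{
    assert(a != NULL && b != NULL && c != NULL && out != NULL);
    for (size_t i = 0; i < n; ++i) {
        out[i] = c[i] ? a[i] : b[i];
    }
}
```

Calling cond_select(a, b, c, out, n) fills out with the selected lanes;
the three source arrays are left unchanged.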
In one implementation of the RSVP architecture, the conditional instruction
is mapped onto a 3-input logic unit. In a different, proposed implementation,
the conditional instruction will be mapped onto a general 2-input ALU. The ALU
is 192 bits wide and can be partitioned into twelve 16-bit "slices". The
conditional instruction gets its third operand, the decision/sel input, from
one of the eight A or B ports in the top four slices of the ALU. This
constraint is imposed in order to reduce the number of bits required to encode
all the instructions that are mapped onto the ALU.
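The encoding benefit of this constraint can be illustrated with a small
sketch. The slice count, the two ports per slice, and the four sel-capable
slices come from the text; the helper names, the choice of slice indices
8..11 as the "top four", and the bit layout of the sel-source code are
assumptions, since the actual instruction encoding is not given here.

```c
#include <assert.h>

#define NUM_SLICES      12u  /* twelve 16-bit slices (from the text) */
#define PORTS_PER_SLICE 2u   /* each slice has an A port and a B port */
#define SEL_SLICES      4u   /* only the top four slices may supply sel */

/* Number of bits needed to encode one of n choices (n >= 1). */
static unsigned bits_for_choices(unsigned n)
{
    assert(n >= 1u);
    unsigned bits = 0u;
    while ((1u << bits) < n) {
        ++bits;
    }
    return bits;
}

/* Hypothetical decoding of a sel-source code 0..7 into (slice, port).
 * Assumptions: the "top four" slices are indices 8..11, and the low bit
 * of the code picks port A (0) or port B (1). */
static void decode_sel_source(unsigned code, unsigned *slice, unsigned *port)
{
    assert(code < SEL_SLICES * PORTS_PER_SLICE);
    *slice = (NUM_SLICES - SEL_SLICES) + code / PORTS_PER_SLICE;
    *port = code % PORTS_PER_SLICE;
}
```

Under these assumptions, bits_for_choices(SEL_SLICES * PORTS_PER_SLICE)
gives 3 bits for the eight permitted ports, whereas allowing any of the
24 ports in the ALU would need bits_for_choices(24), that is, 5 bits.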
Since the conditional instruction "borrows" an input port from a different
slice, the availability of that slice is reduced. The challenge is to assign
input ports to the decision inputs of all the conditional instructions
scheduled in any given cycle in a way that minimizes the number of unusable
slices.
A simple way of dealing with conditional instructions is to assume, for
scheduling purposes, that a conditional instruction requires more ALU slices
than it really needs. For example, if inputs a and b are 32 bits wide, the
conditional instruction mapped onto the previously described ALU would require
three 16-bit slices: two for the a and b inputs and one for the decision/sel
input. The extra slice is used only to route the decision/sel input of the
conditional instruction. Such a solution leads to wasting one extra slice
for each conditional instruction.
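The slice accounting of this simple scheme can be written down as a short
sketch. The 16-bit slice width is from the text; the function names
data_slices and naive_slices are hypothetical.

```c
#include <assert.h>

#define SLICE_BITS 16u  /* slice width from the text */

/* Slices needed to carry operands of the given width, rounded up. */
static unsigned data_slices(unsigned operand_bits)
{
    assert(operand_bits > 0u);
    return (operand_bits + SLICE_BITS - 1u) / SLICE_BITS;
}

/* Simple scheme: every conditional instruction reserves one extra slice
 * whose input port is used only to route the decision/sel input. */
static unsigned naive_slices(unsigned operand_bits, int is_conditional)
{
    return data_slices(operand_bits) + (is_conditional ? 1u : 0u);
}
```

For the 32-bit example, naive_slices(32, 1) yields three slices, while the
same width without a decision input, naive_slices(32, 0), needs only two.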
The proposed solution separates the instructions mapped onto an ALU in any
given cycle into four categories: conditional instructions (CI),
unary instructions (UI), conditional compatible instructions (CCI), and other
instructions (OI). Unary instructions are instructions that require only one
input (e.g., the saturate and absolute instructions). Conditional
compatible instructions are instructions that have unbalanced inputs (i.e., one
operand is wider than the other). All remaining
instructions belong to the other instruction category.
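The four categories above can be expressed as a small classifier. This is
a sketch only: the Instr descriptor, its field names, and the order in
which the tests are applied are assumptions, not details stated in this
document.

```c
#include <assert.h>
#include <stddef.h>

/* The four categories named in the text. */
typedef enum { CAT_CI, CAT_UI, CAT_CCI, CAT_OI } InstrCategory;

/* Hypothetical instruction descriptor; field names are illustrative. */
typedef struct {
    int is_conditional;   /* nonzero for out = c ? a : b */
    unsigned num_inputs;  /* number of input operands */
    unsigned width_a;     /* width of operand a in bits */
    unsigned width_b;     /* width of operand b in bits (unused if unary) */
} Instr;

/* Assumed precedence: conditional first, then unary, then unbalanced. */
static InstrCategory classify(const Instr *in)
{
    assert(in != NULL);
    if (in->is_conditional) {
        return CAT_CI;                 /* conditional instruction */
    }
    if (in->num_inputs == 1u) {
        return CAT_UI;                 /* only one input */
    }
    if (in->width_a != in->width_b) {
        return CAT_CCI;                /* one operand wider than the other */
    }
    return CAT_OI;                     /* everything else */
}
```

With this descriptor, a two-input instruction whose operands are 32 and 16
bits wide is classified as CAT_CCI, and a two-input instruction with equal
operand widths falls into CAT_OI.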
During the scheduling phase, as an integral part of the proposed solution,
the conditional instructions are scheduled at the sam...