Efficient Operating System Support for Floating Point Trapping on a Heavily Pipelined Hardware Architecture
Original Publication Date: 1991-Nov-01
Included in the Prior Art Database: 2005-Apr-04
Mealey, B: AUTHOR [+1]
Addressed is a growing industry problem of supporting precise floating point trapping efficiently, in heavily pipelined machines, such as the IBM RISC System/6000*.
A floating point exception, or trap, typically allows a
user program to be 'signalled', or notified, upon the occurrence of
some event that the hardware is capable of detecting. This event is
typically either some sort of error condition, such as an unexpected
divide by zero, or a detected potential loss of accuracy in a
computation, such as an underflow or overflow.
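The signalling model described above can be sketched in a few lines of Python. This is an illustrative model only, not the mechanism of any particular machine or operating system: the names fp_op, run_with_signals, and TINY are hypothetical, the inputs are assumed to be finite, and any result smaller in magnitude than the smallest normalized double is treated as an underflow.

```python
import math
import sys

TINY = sys.float_info.min  # smallest positive normalized double

def fp_op(op, a, b):
    """Run one 'mul' or 'div' on finite inputs; return (result, event or None)."""
    if op == "div" and b == 0.0:
        if a == 0.0:
            return math.nan, "invalid"
        # Python raises ZeroDivisionError here; model a signed infinity instead.
        sign = math.copysign(1.0, a) * math.copysign(1.0, b)
        return sign * math.inf, "divide_by_zero"
    result = a * b if op == "mul" else a / b
    if math.isinf(result):
        return result, "overflow"
    operands_nonzero = a != 0.0 and (op == "div" or b != 0.0)
    if operands_nonzero and abs(result) < TINY:
        return result, "underflow"
    return result, None

def run_with_signals(program, handler):
    """Execute (op, a, b) tuples in order, calling handler(index, event) on each event."""
    results = []
    for index, (op, a, b) in enumerate(program):
        result, event = fp_op(op, a, b)
        if event is not None:
            handler(index, event)  # notify the 'user program' of the event
        results.append(result)
    return results

signals = []
program = [
    ("mul", 2.0, 3.0),
    ("div", 1.0, 0.0),
    ("mul", 1e308, 10.0),
    ("mul", 1e-200, 1e-200),
]
results = run_with_signals(program, lambda i, e: signals.append((i, e)))
```

Here the handler plays the role of the user program's signal handler, and the index passed to it identifies the operation at which the event was detected.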
On non-pipelined hardware platforms, detecting
exceptional events (such as those previously described) and reporting
them at the 'precise instruction' at which they occurred was less
expensive in hardware. The reason for this is that in a non-pipelined
machine, or in a machine that dispatches only one instruction at a
time and waits for it to complete, it is not too difficult to insert
logic to detect an error condition and report it immediately.
Take, for example, the case of detecting overflow on a floating point
operation. This non-pipelined, single-threaded machine will dispatch
the floating point operation, wait for it to complete, and upon
completion, check for overflow, and signal the operating system if
this is the case. There is very little latency, or overhead, in this
checking above and beyond the machine's design point of dispatching
one instruction at a time and doing all processing on that
instruction before dispatching another instruction. It should be
noted that many machines do improve on this performance
by delaying the check for the exception until the 'next'
floating point operation is performed. That allows overlapping
floating point and fixed point operations.
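The two checking policies just described, checking immediately on completion and deferring the check until the 'next' floating point operation, can be contrasted with a small simulation. This is a minimal sketch under stated assumptions: the function run, the "fp" and "fx" instruction tags, and the use of multiplication as the only floating point operation are all hypothetical choices made for illustration, and only overflow is checked.

```python
import math

def run(program, deferred):
    """Return (reported_at, faulting_index) pairs for overflowing 'fp' multiplies."""
    reports = []
    unchecked = None  # (index, result) of the latest FP op whose check is outstanding

    def check(at, entry):
        faulting_index, result = entry
        if math.isinf(result):
            reports.append((at, faulting_index))

    for index, (kind, a, b) in enumerate(program):
        if kind != "fp":
            continue  # fixed point op: no floating point check is needed
        if unchecked is not None:
            check(index, unchecked)  # deferred check fires at the next FP op
            unchecked = None
        result = a * b
        if deferred:
            unchecked = (index, result)
        else:
            check(index, (index, result))  # wait for completion, then check at once
    if unchecked is not None:
        check(len(program), unchecked)  # drain the last outstanding check
    return reports

program = [("fp", 1e308, 10.0), ("fx", 1, 2), ("fx", 3, 4), ("fp", 2.0, 3.0)]
immediate_reports = run(program, deferred=False)
deferred_reports = run(program, deferred=True)
```

With immediate checking the overflow at instruction 0 is reported at instruction 0. With deferred checking the same overflow is reported only when instruction 3, the next floating point operation, is dispatched, so the two fixed point instructions in between are not held up by the check.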
The issue of detecting floating point exceptions, and reporting
them precisely becomes an order of magnitude more difficult in
hardware that is heavily pipelined. The performance gain of checking
for the exception on the 'next' floating point operation is not good
enough in this case, because it still slows the machine down
substantially. To understand this, a brief discussion of pipelined
architecture is necessary. A heavily pipelined hardware architecture,
by its nature, exploits the observation that the result of a hardware
instruction is not always needed immediately by the next
instruction. It is, therefore, possible to dispatch multiple instructions
at once, without explicitly waiting for any given instruction to
complete until (or unless) its result is needed. Ideally, the
instruction will have completely executed by the time the
instruction's result is needed, and there will be no waiting by the
processor. It is important to understand that the key design point of
pipelined hardware is to minimize wait-states by the CPU and to
execute all i...