
Rounding IEEE Floating Point Results

IP.com Disclosure Number: IPCOM000044048D
Original Publication Date: 1984-Oct-01
Included in the Prior Art Database: 2005-Feb-05
Document File: 3 page(s) / 37K

Publishing Venue

IBM

Related People

Finney, D: AUTHOR [+2]

Abstract

This article describes an arrangement which reduces the number of processor cycles required to round intermediate results in floating point (FP). The IEEE (binary) FP architecture has several features that are quite different from IBM System/370 (hexadecimal) FP. One of these differences is that all operations defined by this architecture produce an intermediate result that can be regarded as infinitely precise. This number must then be delivered to its destination which is of finite length (single, double or double-extended format). After normalization or denormalization, if the infinitely precise intermediate result is not exactly representable in its destination format, then it must be rounded to the precision of the destination.

This is the abbreviated version, containing approximately 52% of the total text.


Four modes of rounding are provided, user-selectable through bits 10 and 11 in the FP status word. They are encoded as follows:

00 - Round to nearest
01 - Round to zero
10 - Round towards + infinity
11 - Round towards - infinity

If Z is the infinitely precise intermediate result, and Z1 and Z2 are the next larger and next smaller numbers in the destination format that bound Z, then Z1 or Z2 is chosen to approximate the result in the destination format according to the following rules:

Round to nearest - Choose whichever of Z1 or Z2 is closer to Z. In the case of a tie, choose the one which is even (least significant bit 0).
Round to zero - Choose the smaller in magnitude (Z1 or Z2).
Round towards + infinity - Choose Z1.
Round towards - infinity - Choose Z2.

In order to implement this requirement, it is necessary to save data after it has been shifted out of the least significant bit (LSB). Only a small number of bits need to be saved in order to generate a normalized and rounded result which appears as though all bits had been saved (infinitely precise).
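The four rounding rules can be illustrated with exact rational arithmetic. The following is a minimal sketch, not the arrangement described in this article: it assumes the destination format is modeled as the integer multiples of a fixed unit `ulp` (exponent range and overflow are ignored), and the names `round_exact`, `ulp` and the mode constants are hypothetical, chosen only to mirror the 2-bit encoding listed above.

```python
from fractions import Fraction
import math

# 2-bit mode encodings as listed in the text (constant names are illustrative).
NEAREST, TO_ZERO, TO_PLUS_INF, TO_MINUS_INF = 0b00, 0b01, 0b10, 0b11


def round_exact(z, ulp, mode):
    """Round the exact value z to a multiple of ulp under the given mode.

    Assumption: the destination format is modeled as all integer multiples
    of ulp, so Z2 = floor(z / ulp) * ulp and Z1 = Z2 + ulp bound z.
    """
    z, ulp = Fraction(z), Fraction(ulp)
    k = math.floor(z / ulp)        # integer index of Z2
    z2 = k * ulp                   # next smaller representable value
    if z2 == z:
        return z                   # exactly representable, no rounding needed
    z1 = z2 + ulp                  # next larger representable value

    if mode == TO_PLUS_INF:
        return z1
    if mode == TO_MINUS_INF:
        return z2
    if mode == TO_ZERO:
        return z1 if z < 0 else z2  # the bound that is smaller in magnitude
    # Round to nearest: pick the closer bound; on a tie pick the even one.
    d1, d2 = z1 - z, z - z2
    if d1 != d2:
        return z1 if d1 < d2 else z2
    return z2 if k % 2 == 0 else z1


if __name__ == "__main__":
    for mode in (NEAREST, TO_ZERO, TO_PLUS_INF, TO_MINUS_INF):
        print(format(mode, "02b"), round_exact(Fraction(-5, 2), 1, mode))
```

Working with exact fractions keeps the example faithful to the idea of an "infinitely precise" Z; a hardware implementation would instead work from a few saved bits.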
The standard implementation is to save the last bit shifted out as the guard bit (G), the next-to-last bit shifted out as the round bit (R), and the OR of all other bits shifted out as the sticky bit (S). The intermediate result Z is represented using these bits, and the rounded result must be generated from them. To generate the next larger number Z1, a one is added at the LSB and the G, R and S bits are truncated. The next smaller number Z2 is generated by simply truncating the G, R and S bits.

Most microcoded processors require several cycles to do this rounding. First, they determine which rounding mode is in use. Then they test the appropriate conditions to determine whether Z1 or Z2 should be used as the rounded result. Finally, they generate either Z1 or Z2. Referring to the dr...
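The guard, round and sticky scheme and the multi-step decision described above can be sketched in software as follows. This is an illustrative sketch under stated assumptions, not the arrangement of this article: it assumes a sign-magnitude significand held as a non-negative integer with a separate sign flag, so "adding one at the LSB" moves the value away from zero; the function names `shift_right_grs` and `round_grs` are hypothetical. A carry out of the significand after the increment, which would require renormalization, is not handled.

```python
# 2-bit mode encodings as listed in the text (constant names are illustrative).
NEAREST, TO_ZERO, TO_PLUS_INF, TO_MINUS_INF = 0b00, 0b01, 0b10, 0b11


def shift_right_grs(sig, n):
    """Shift the magnitude sig right by n bits, returning (kept, G, R, S).

    G is the last bit shifted out, R the next-to-last bit shifted out,
    and S the OR of all earlier bits shifted out.
    """
    g = r = s = 0
    for _ in range(n):
        s |= r            # older bits accumulate into the sticky bit
        r = g             # previous guard becomes the round bit
        g = sig & 1       # newest bit shifted out becomes the guard bit
        sig >>= 1
    return sig, g, r, s


def round_grs(negative, sig, g, r, s, mode):
    """Return the rounded magnitude, chosen from G, R, S and the mode.

    Truncating G, R, S gives the smaller magnitude; adding one at the LSB
    gives the larger magnitude (assumption: sign-magnitude representation).
    """
    inexact = g | r | s
    if mode == NEAREST:
        # Above half: G=1 with R or S set. Exact tie: G=1, R=S=0, so
        # increment only when the LSB is 1, which makes the result even.
        increment = g and (r or s or (sig & 1))
    elif mode == TO_ZERO:
        increment = 0
    elif mode == TO_PLUS_INF:
        increment = inexact and not negative
    else:  # TO_MINUS_INF
        increment = inexact and negative
    return sig + 1 if increment else sig


if __name__ == "__main__":
    kept, g, r, s = shift_right_grs(0b1011011, 4)   # 91 / 16 = 5.6875
    print(kept, g, r, s, round_grs(False, kept, g, r, s, NEAREST))
```

The three sequential steps in `round_grs` (examine the mode, test the conditions, produce the incremented or truncated value) correspond to the several cycles that the text attributes to most microcoded processors.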