
Full Accuracy Floating Point Rounding Operations by Pre-processing the Sticky Bits in Advance

IP.com Disclosure Number: IPCOM000171117D
Publication Date: 2008-May-29

Publishing Venue

The IP.com Prior Art Database

Abstract

Floating point operators such as ADDER, MULTIPLIER, DIVIDER, and SQUARE ROOT follow the IEEE 754 standard. The accuracy of the floating point results of these operations depends on the rounding approach. The rounding method most commonly used in the industry is "round-to-nearest-even". The accuracy of this method is determined by the total number of internal bits used to store the resulting mantissa data. These internal bits are made up of Rounding Bits and Sticky Bits (refer to section 2.2 for details). Normally, a floating point operator needs to reserve a result space of 80 bits or more (≈32 internal bits in double precision mode) in order to round with the required accuracy. This approach not only slows down the floating point operation as a whole, but also consumes redundant resources. To counter this lengthy rounding process, the contents of the internal bits need to be predicted or pre-computed in advance, or generated by an alternate method. Doing so can shorten the lengthy carry chain or eliminate redundant iterations.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 22% of the total text.



1. Introduction

    Floating point operators such as ADDER, MULTIPLIER, DIVIDER, and SQUARE ROOT follow the IEEE 754 standard. The accuracy of the floating point results of these operations depends on the rounding approach. The rounding method most commonly used in the industry is "round-to-nearest-even". The accuracy of this method is determined by the total number of internal bits used to store the resulting mantissa data. These internal bits are made up of Rounding Bits and Sticky Bits (refer to section 2.2 for details).

    Normally, a floating point operator needs to reserve a result space of 80 bits or more (≈32 internal bits in double precision mode) in order to round with the required accuracy. This approach not only slows down the floating point operation as a whole, but also consumes redundant resources.

    To counter this lengthy rounding process, the contents of the internal bits need to be predicted or pre-computed in advance, or generated by an alternate method. Doing so can shorten the lengthy carry chain or eliminate redundant iterations.

    The methods presented here extract the resulting sticky bits without actually storing the whole mantissa result in internal bits. In most cases, part of the rounding process can even be pre-processed or pre-extracted before the actual arithmetic operation takes place. The sticky bits can be pre-processed through:
• FP_ADD_SUB - sticky bits look ahead
• FP_MULT - trailing zeros pre-counting
• FP_DIV, FP_SQRT - zero remainder checker
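
The three ideas listed above can be illustrated with a minimal Python sketch on unsigned integer mantissas. This is not the circuit described in the disclosure, whose details are not included in this abbreviated text. It only demonstrates the underlying arithmetic facts: the number of trailing zeros of a product equals the sum of the trailing zeros of its operands, a truncated quotient or square root is inexact exactly when the remainder is nonzero, and the bits shifted out while aligning the smaller addend are known before the addition starts. All function names are hypothetical, and the add/sub helper covers only the alignment shift, not any later interaction with borrows in subtraction.

```python
import math


def trailing_zeros(x: int) -> int:
    """Count the trailing zero bits of a positive integer."""
    return (x & -x).bit_length() - 1


def sticky_mult(a: int, b: int, k: int) -> int:
    """Sticky bit (OR of the low k bits) of a*b, without forming the product.

    Relies on trailing_zeros(a*b) == trailing_zeros(a) + trailing_zeros(b).
    """
    if a == 0 or b == 0:
        return 0
    return 1 if trailing_zeros(a) + trailing_zeros(b) < k else 0


def sticky_div(n: int, d: int) -> int:
    """Sticky bit of a truncated quotient: set when the remainder is nonzero."""
    return 1 if n % d != 0 else 0


def sticky_sqrt(n: int) -> int:
    """Sticky bit of a truncated square root: set when n is not a perfect square."""
    root = math.isqrt(n)
    return 1 if n - root * root != 0 else 0


def sticky_align(small: int, shift: int) -> int:
    """Sticky bit from the bits shifted out when aligning the smaller addend."""
    return 1 if small & ((1 << shift) - 1) else 0
```

For example, `sticky_mult(12, 10, 4)` examines only the operands (2 + 1 = 3 trailing zeros, which is fewer than 4) and agrees with OR-ing the low four bits of the full product 120.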

2. Background

2.1 What is the Round-to-nearest-even method?

Suppose the mantissa result after a floating point operation is as shown below:

Figure 1: Standard IEEE754 Floating-Point Number Format


    Since only a result with the smaller mantissa width is significant, the internal bits will be discarded. However, the contents of the internal bits may be large enough to contribute an extra 1 to the mantissa, so a decision rule or algorithm is needed to determine when the +1 is required.

The decision of +1 rounding for round-to-nearest method is based on the truth table below:

Guard Bit   Round Bit   OR-chain of Sticky Bits   Rounding
    0           0                  0                 N/A
    0           0                  1                 N/A
    0           1                  0                 N/A
    0           1                  1                 +1
    1           0                  0                 N/A
    1           0                  1                 N/A
    1           1                  0                 +1
    1           1                  1                 +1

Table 1: Round-to-nearest Even Truth Table

Rounding = Round Bit AND (Guard Bit OR (Sticky Bits OR Chain))
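
As an illustration, the rounding equation above can be sketched in Python on an unsigned integer mantissa. The sketch assumes that the "Guard Bit" in Table 1 is the least significant bit of the kept mantissa, that the "Round Bit" is the first discarded bit, and that the sticky bits are all remaining discarded bits. This reading is an assumption chosen because it makes the equation coincide with round-to-nearest-even. The function name and parameters are hypothetical.

```python
def round_nearest_even(value: int, k: int) -> int:
    """Discard the low k internal bits of value, rounding to nearest even.

    Assumed bit roles (not confirmed by the source):
      guard     = least significant bit of the kept mantissa
      round_bit = most significant discarded bit
      sticky    = OR of all remaining discarded bits
    """
    if k <= 0:
        return value
    kept = value >> k
    guard = kept & 1
    round_bit = (value >> (k - 1)) & 1
    sticky = 1 if value & ((1 << (k - 1)) - 1) else 0
    # Rounding = Round Bit AND (Guard Bit OR (Sticky Bits OR Chain))
    increment = round_bit & (guard | sticky)
    return kept + increment
```

With k = 2, the input 0b1010 (2.5 in units of the kept mantissa) stays at 2 because the tie goes to the even value, while 0b1110 (3.5) rounds up to 4 and 0b1011 (2.75) rounds up to 3.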

2.2 Round-to-nearest with full accuracy

    In order to have an accurate rounding, every single bit in the sticky bits chain must be considered. Thus, many registers are needed to store all sticky bits for accurate rounding. The more of the stic...