Sticky Bit Generation for IEEE Floating-Point Multiplication

IP.com Disclosure Number: IPCOM000039999D
Original Publication Date: 1987-Sep-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 4 page(s) / 70K

Publishing Venue

IBM

Related People

Keung, TW: AUTHOR

Abstract

This article describes a scheme for generating the sticky bit in floating-point (FP) multiplication. The sticky bit is generated for an IEEE-standard FP multiply operation implemented with Booth's algorithm. The scheme uses the existing carry lookahead (CLA) final adder, together with a minimal amount of logic, to generate the sticky bit. This saves logic and avoids additional FP multiply execution cycles. The approach can be adapted to multiply units without a final adder by adding a spill adder. The multiply chip set is a uni-format dual-function multiply FP processor. All input operands to the multiply chips are treated as uni-format FP numbers with a 64-bit fraction (double extended format).

This is the abbreviated version, containing approximately 45% of the total text.

(Image Omitted)

The multiply chip set is a uni-format dual-function multiply FP processor. All input operands to the multiply chips are treated as uni-format FP numbers with a 64-bit fraction (double extended format). Inside the multiply chip set, a CLA adder generates the intermediate product from the partial sum and carry of the final multiply iteration cycle. This adder is idle during the multiply iteration cycles. The multiply chip set uses the two-bit-shift version of Booth's algorithm for multiply operations. Three two-bit groups of the multiplier are scanned every iteration cycle, starting from the least significant end. Three multiplicand values are generated according to the Booth decode of the scanned multiplier bits.
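The two-bit-shift Booth decode described above can be illustrated with a short software sketch. This is not the chip's logic: the function names, the unsigned treatment of the multiplier, and the extra all-zero group on top are assumptions made for illustration. Each two-bit group, together with the bit just below it, selects a multiple of the multiplicand in {-2, -1, 0, +1, +2}, and three such digits are consumed per iteration cycle.

```python
def booth_digits(multiplier: int, width: int) -> list[int]:
    """Radix-4 (two-bit-shift) Booth digits of an unsigned multiplier, LSB first.

    Each digit is in {-2, -1, 0, +1, +2}. `width` must be even and the
    multiplier must fit in `width` bits (assumptions of this sketch).
    """
    digits = []
    prev = 0  # bit just below the current two-bit group (0 for the first group)
    # One extra all-zero group on top absorbs a pending +1 for unsigned operands.
    for i in range(0, width + 2, 2):
        b0 = (multiplier >> i) & 1
        b1 = (multiplier >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def group_per_cycle(digits: list[int], per_cycle: int = 3) -> list[list[int]]:
    """Split the digits into iteration cycles of `per_cycle` digits each."""
    padded = digits + [0] * (-len(digits) % per_cycle)
    return [padded[i:i + per_cycle] for i in range(0, len(padded), per_cycle)]


def multiply(multiplicand: int, multiplier: int, width: int) -> int:
    """Sum the Booth-selected multiples; digit k carries weight 4**k."""
    return sum(d * multiplicand * 4 ** k
               for k, d in enumerate(booth_digits(multiplier, width)))
```

In this sketch a 64-bit multiplier yields 33 digits, which group into 11 cycles of three digits; the cycle count of the actual chip set is not stated in the text.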

(Image Omitted)

Whenever a negative decode occurs (-1X or -2X of the multiplicand), a twos complement of the multiplicand value is required. This is accomplished by taking the ones complement of the multiplicand from the complement output of the multiplicand register and inputting a 1 either to the least significant bit of the carry save adder (CSA) of the next level or to the least significant bit of the partial carry generated every iteration cycle. Referring to the diagram of Fig. 1, the three multiplicand values are fed to CSA 1. The twos complement adjust of multiplicand value one is fed to the least significant bit of CSA 2. The twos complement adjust of multiplicand value two is fed to the least significant bit of CSA 3. The twos complement adjust of multiplicand value three becomes the least significant bit of the partial carry generated every iteration cycle.
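The reason the adjust bit can be injected at the least significant bit is that the carry output of a CSA is shifted left by one position, so its least significant bit is always vacant. The following minimal sketch shows this for a single CSA level. The 16-bit width and the function names are assumptions chosen for illustration; the actual wiring of CSA 1 through CSA 3 is given in Fig. 1, which is not available here.

```python
WIDTH = 16                    # illustrative width, not the chip's datapath width
MASK = (1 << WIDTH) - 1


def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """Carry-save adder: sum + carry == a + b + c (mod 2**WIDTH)."""
    s = a ^ b ^ c
    carry = ((a & b) | (a & c) | (b & c)) << 1
    return s & MASK, carry & MASK


def add_negated(x: int, y: int, z: int) -> tuple[int, int]:
    """Return (sum, carry) with sum + carry == y + z - x (mod 2**WIDTH)."""
    ones_complement = ~x & MASK          # complement output of the register
    s, carry = csa(ones_complement, y, z)
    # The shifted carry vector always has a 0 in its least significant bit,
    # so the +1 twos complement adjust can be placed there at no extra cost.
    assert carry & 1 == 0
    return s, carry | 1
```

For example, `add_negated(5, 20, 7)` returns a pair whose ordinary sum, reduced modulo 2**16, equals 22.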

(Image Omitted)

The partial sum and carry of the previous iteration cycle are right-shifted 6 bit positions and added together with the three multiplicand values to form a new partial sum and carry. The partial sum and carry are set to zero during the very first multiply iteration cycle. The sticky bit is generated in parallel with the CSA generating the current partial sum and carry during the multiply iteration cycle. The least significant 6 bits of both the partial sum and the partial carry, together with a carry in (all from the previous iteration cycle), are fed to the least significant 6 bits of the CLA adder. A sum is generated, and is fed to the sticky-bit generate logic, shown in Fig. 2, with the sticky bit of the previous cy...
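The per-cycle sticky accumulation can be sketched in software as follows. Because the text is truncated and Fig. 2 is unavailable, several points are assumptions: the sticky bit is modeled as the OR of the 6-bit sum with the previous sticky bit, the carry in of each cycle is taken to be the carry out of the previous cycle's 6-bit addition, and a single combined multiple per cycle stands in for the three Booth-decoded multiplicand values. All names are hypothetical. The sketch covers only the bits shifted out during the iteration cycles, not any rounding-position logic applied to the final sum.

```python
GROUP = 6                     # bits shifted out per iteration cycle (from the text)
LOW_MASK = (1 << GROUP) - 1


def csa(a: int, b: int, c: int) -> tuple[int, int]:
    """Carry-save adder: returns (sum, carry) with sum + carry == a + b + c."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def multiply_with_sticky(multiplicand: int, multiplier: int, cycles: int):
    """Return (high_part, sticky) for an unsigned multiply.

    high_part is the product shifted right by GROUP * (cycles - 1) bits;
    sticky is True if any of the shifted-out product bits is 1.
    """
    ps = pc = 0               # partial sum and carry, zero in the first cycle
    carry_in = 0
    sticky = False
    for k in range(cycles):
        # Low 6 bits of the previous partial sum and carry, plus the carry in,
        # pass through the low 6 bits of the otherwise idle adder.
        t = (ps & LOW_MASK) + (pc & LOW_MASK) + carry_in
        sticky = sticky or (t & LOW_MASK) != 0   # assumed: OR with previous sticky
        carry_in = t >> GROUP                    # assumed: carry out feeds next cycle
        # Simplification: one combined multiple instead of three Booth multiples.
        group = (multiplier >> (GROUP * k)) & LOW_MASK
        ps, pc = csa(ps >> GROUP, pc >> GROUP, group * multiplicand)
    # The final adder pass resolves the remaining partial sum and carry.
    return ps + pc + carry_in, sticky
```

The point of the arrangement is that the low-order product bits never need to be stored or re-examined: each cycle reduces them to one carry and one sticky bit, using adder hardware that would otherwise sit idle until the final cycle.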