Algorithm for 3-Bit Overlapped Scanning Multiplication

IP.com Disclosure Number: IPCOM000040561D
Original Publication Date: 1987-Dec-01
Included in the Prior Art Database: 2005-Feb-02
Document File: 3 page(s) / 53K

Publishing Venue

IBM

Related People

Grab, AV: AUTHOR [+3]

Abstract

Significant reductions in processing time and cost result from using two or more parallel chips for multiplying with the 3-bit overlapped scanning technique. The chips can be common part numbers and use concurrent decoding and addition with a minimal-width adder. Assuming a floating-point 56-bit multiplier x and 56-bit multiplicand y with bit 0 as most significant and bit 55 as least significant, multiplier x is divided into bits (0-27) and (28-55) and multiplied with y for each circuit chip to provide respective high and low partial products of PPRH and PPRL. These partial products (Fig. 1) are each of 84 bits that are added after displacement of PPRL to the right 28 bit positions to give a result of 112 bits. Only the overlapping parts RB and RC need to be added, as shown in Fig. 2.

This is the abbreviated version, containing approximately 52% of the total text.

Significant reductions in processing time and cost result from using two or more parallel chips for multiplying with the 3-bit overlapped scanning technique. The chips can be common part numbers and use concurrent decoding and addition with a minimal-width adder.

Assume a floating-point 56-bit multiplier x and 56-bit multiplicand y, with bit 0 as most significant and bit 55 as least significant. Multiplier x is divided into bits (0-27) and (28-55), and each half is multiplied with y on its own chip to provide the respective high and low partial products, PPRH and PPRL. These partial products (Fig. 1) are each 84 bits wide and are added after PPRL is displaced 28 bit positions to the right, giving a result of 112 bits. Only the overlapping parts RB and RC need to be added, as shown in Fig. 2. A "hot 1" can be added to RA, if required. RD is in final form.

(Image Omitted)

Both chips execute the same iterative paths to reduce the multiplier. This requires half the number of cycles of a single-chip reduction, plus two floating-point cycles for the PPRL bus transfer and the addition of RB and RC.

Three-bit overlapped scanning is used, which reduces each of the two 28-bit multipliers to fourteen groups, as shown in Fig. 3. If three decodes are considered at a time, five iterations per chip are needed to retire the multiplier, and thus twelve bits are reduced each iteration (six on each chip).

In Fig. 3, the least significant bit, denoted by *, is forced to zero and a zero (z) is appended in front. Although this step neglects multiplication by 1, a correction can be made by loading the multiplicand into the first intermediate partial product (IPP) register when loading the Y register. If the least significant bit (LSB) is a 1, the IPP is gated into the addition path during the first iteration, resulting in multiplication by one. If the LSB is zero, a zero is gated into the iteration path instead.

This iterative multiplication decodes seven bits each time to generate three multiple terms, which are added through carry-save and carry-propagate adders to obtain an IPP. The process repeats, adding the shifted IPP in with the multiple terms, until a final partial product, PPRH or PPRL, is produced. The decoding and addition are done concurrently as two subiterations, which permits pipelining.

A schematic diagram of a two-chip embodiment for the overlapped scanning multiplication is shown in Figs. 4 and 5. The chips are quite similar; they differ in that the multiply-low chip ML of Fig. 5 has logic to send PPRL to the multiply-high chip MH of Fig. 4, and MH has logic to add the partial products and to send results out.

(Image Omitted)

In Fig. 5, the Decode Logic decodes the multiplier XREG and produces the proper enabling signals for the seven bits during each decode subiteration. Multiplexers (Muxes) 4-6 shift or complement the multiplicand (YREG) according to the decode a...
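
The arithmetic described above can be checked with a small software model. The following is a minimal sketch, not the disclosed hardware: the function names are hypothetical, bit positions are counted from the least significant end (the reverse of the text's numbering), multiples are shifted left into an accumulator instead of shifting the IPP right, and the RA/RB/RC/RD field boundaries are inferred from the 84-bit widths and the 28-bit displacement because Fig. 2 is not available here.

```python
HALF = 28                      # multiplier bits handled by each chip
MASK28 = (1 << 28) - 1
MASK56 = (1 << 56) - 1


def scan_groups(x_half):
    """Recode a 28-bit multiplier half into 14 overlapped 3-bit digits.

    Mirrors the text: the LSB is forced to zero and a zero is placed in
    front. Returns (digits, lsb); digits are least significant first and
    each lies in the range -2..2.
    """
    assert 0 <= x_half < (1 << HALF)
    lsb = x_half & 1
    padded = x_half & ~1       # forced-zero LSB; the leading zero is implicit
    digits = []
    for i in range(HALF // 2):                 # fourteen groups
        group = (padded >> (2 * i)) & 0b111    # adjacent groups share one bit
        low, mid, high = group & 1, (group >> 1) & 1, (group >> 2) & 1
        digits.append(low + mid - 2 * high)
    return digits, lsb


def multiply_half(x_half, y, decodes_per_iteration=3):
    """Model one chip: return (partial product, iteration count)."""
    digits, lsb = scan_groups(x_half)
    ipp = y if lsb else 0      # correction for the forced-zero LSB
    iterations = 0
    for start in range(0, len(digits), decodes_per_iteration):
        stop = min(start + decodes_per_iteration, len(digits))
        for i in range(start, stop):
            # Digit i carries weight 2 * 4**i because the LSB slot is unused.
            ipp += (digits[i] * y) << (2 * i + 1)
        iterations += 1
    return ipp, iterations


def combine(pprh, pprl):
    """Add only the overlapping fields RB and RC (assumed field layout)."""
    ra, rb = pprh >> 56, pprh & MASK56
    rc, rd = pprl >> 28, pprl & MASK28
    middle = rb + rc
    hot_one = middle >> 56     # carry out of the 56-bit overlap addition
    return ((ra + hot_one) << 84) | ((middle & MASK56) << 28) | rd


def multiply_two_chip(x, y):
    """Split x into high and low halves, multiply each by y, then combine."""
    pprh, iters_high = multiply_half(x >> HALF, y)
    pprl, iters_low = multiply_half(x & MASK28, y)
    return combine(pprh, pprl), pprh, pprl, (iters_high, iters_low)


x, y = 0x123456789ABCDE, 0xFEDCBA98765432
product, pprh, pprl, iterations = multiply_two_chip(x, y)
assert product == x * y
assert iterations == (5, 5)
```

With three decodes per iteration, fourteen groups take five iterations per half, matching the count given in the text, and each partial product fits in 84 bits.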