High Speed Multiply using Four Input Carry Save Adder

IP.com Disclosure Number: IPCOM000080332D
Original Publication Date: 1973-Dec-01
Included in the Prior Art Database: 2005-Feb-27
Document File: 2 page(s) / 49K

Publishing Venue

IBM

Related People

Larson, RH: AUTHOR

Abstract

By examining two adjacent bit positions together, a carry save adder can reduce four input words to two output words in a single stage, which is effectively two additions. However, because the high-order output bits are fairly complex functions of all eight input bits, monolithic array logic (MAL) is more feasible than standard logic.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 56% of the total text.


A representative MAL module has four outputs, each of which can be made an arbitrary function of eight inputs. A further input gates the outputs, so that a number of modules may be dotted together.
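
The way one such module could serve as a two-position slice of a four-input carry save adder can be sketched in Python. This is only an illustration under stated assumptions: the disclosure does not give the exact output functions, so the sketch assumes each module simply encodes the 4-bit sum of the four 2-bit input fields it sees. The names `build_slice_table`, `SLICE_TABLE`, and `csa4` are hypothetical and do not come from the original.

```python
def build_slice_table():
    """256-entry table: 8 input bits -> 4 output bits for one two-position slice.

    Assumption: the slice outputs the plain sum of its four 2-bit fields.
    """
    table = []
    for index in range(256):
        # Unpack four 2-bit fields, one per input word.
        fields = [(index >> (2 * k)) & 0b11 for k in range(4)]
        table.append(sum(fields))  # range 0..12, which fits in four bits
    return table


SLICE_TABLE = build_slice_table()


def csa4(words, width):
    """Reduce four non-negative words to two words with the same total.

    `width` is the number of bit positions covered and is assumed to be even;
    every input word is assumed to fit in `width` bits.
    """
    low_word = 0   # output bits of weight 1 and 2 from every slice
    high_word = 0  # output bits of weight 4 and 8, aligned with the next slice
    for pos in range(0, width, 2):
        index = 0
        for k, word in enumerate(words):
            index |= ((word >> pos) & 0b11) << (2 * k)
        out = SLICE_TABLE[index]
        low_word |= (out & 0b11) << pos
        high_word |= (out >> 2) << (pos + 2)
    return low_word, high_word


low, high = csa4([13, 7, 9, 15], width=4)
print(low, high, low + high)  # 4 40 44
```

Because each slice sees eight input bits and produces four output bits, the table matches the module shape described above, and the two high-order output bits depend on all eight inputs. No carry ripples between slices: the two returned words together carry the same total as the four inputs.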

One high-speed multiplier built from this device is shown in Fig. 1. Take a cycle to be the time through one MAL level and a latch. After the multiplicand (MCD) is set in its register, the multiplier (M) digits are presented one per cycle, starting with the low-order digit, until M is entirely retired. At the end of every cycle, a multiple is available at the output of carry save adder CS1. At the end of cycle 2, one digit is presented to the 4-bit carry propagate adder CP, and the remaining digits of the partial product are available to be wrapped around and added to the second multiple. A possible speed-up is a zero detect on the unretired portion of M and on the carry output of CS2: if both are all zeros, the remaining product can be read out broadside.
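
The digit-serial flow described above can be approximated by a small behavioral model in Python. It is a sketch, not the circuit of Fig. 1, and it rests on several assumptions that the text does not confirm: digits are taken to be 4 bits wide (inferred from the 4-bit CP), the CS1 output is modelled as an ordinary integer multiple, the carry held by CP is fed back as the fourth input of CS2, and the zero detect also checks that held carry. The names `csa3` and `multiply` are hypothetical.

```python
DIGIT_BITS = 4                      # assumed digit width, matching the 4-bit CP
DIGIT_MASK = (1 << DIGIT_BITS) - 1


def csa3(a, b, c):
    """Bitwise 3-to-2 carry save step: a + b + c == sum + carry."""
    return a ^ b ^ c, ((a & b) | (a & c) | (b & c)) << 1


def multiply(mcd, m):
    """Return (product, cycles) for non-negative integers mcd and m."""
    save_sum = save_carry = 0       # wrapped-around partial product, carry-save form
    cp_carry = 0                    # carry held by the carry propagate adder
    product = 0
    shift = 0
    cycles = 0
    while True:
        # Zero detect: no multiplier digits left and no pending carries,
        # so the rest of the product is read out in one step.
        if m == 0 and save_carry == 0 and cp_carry == 0:
            return product | (save_sum << shift), cycles
        digit = m & DIGIT_MASK      # next multiplier digit, low order first
        m >>= DIGIT_BITS
        multiple = mcd * digit      # stands in for the CS1 output
        # Four-input carry-save add, modelled here as two 3-to-2 steps.
        s, c = csa3(save_sum, save_carry, multiple)
        s, c = csa3(s, c, cp_carry)
        # The low digit is resolved by the carry propagate adder.
        low = (s & DIGIT_MASK) + (c & DIGIT_MASK)
        product |= (low & DIGIT_MASK) << shift
        cp_carry = low >> DIGIT_BITS
        # The remaining digits wrap around for the next cycle.
        save_sum, save_carry = s >> DIGIT_BITS, c >> DIGIT_BITS
        shift += DIGIT_BITS
        cycles += 1


print(multiply(5, 3))  # (15, 1)
```

In this model only one digit per cycle passes through a carry-propagating add. Everything else stays in carry-save form until the zero detect allows the remainder to be read out at once.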

CS2 is a simple 4-input adder and may be implemented with two arrays for every digit of width. (This may have to be doubled for parity predict.) If CS1 is implemented as shown, it would cost four arrays per digit for data and two for parity. Combining the input shifting and ga...