
High Performance Vector Register Clear

IP.com Disclosure Number: IPCOM000050649D
Original Publication Date: 1982-Nov-01
Included in the Prior Art Database: 2005-Feb-10
Document File: 3 page(s) / 58K

Publishing Venue

IBM

Related People

Watkins, GJ: AUTHOR

Abstract

In a vector processor, the elements of the vector operand(s) are usually held in a register for processing. This vector register can be any length and is specified as an operand in the instruction. For example, assume that there are two vector registers in use, each 128 elements in length. An instruction of the form VR1 = VR1 + VR2 would require that the processor take each element of VR1, add it to the corresponding element of VR2, and place the sum in VR1. This instruction, when given at the register level, has a total execution time that is a function of the register length, that is, the number of elements to be processed. In a processor with multiple vector registers, it is often necessary to "clear" these vector registers (store zeroes into them).

This is the abbreviated version, containing approximately 53% of the total text.



In a vector processor, the elements of the vector operand(s) are usually held in a register for processing. This vector register can be any length and is specified as an operand in the instruction. For example, assume that there are two vector registers in use, each 128 elements in length. An instruction of the form VR1 = VR1 + VR2 would require that the processor take each element of VR1, add it to the corresponding element of VR2, and place the sum in VR1. This instruction, when given at the register level, has a total execution time that is a function of the register length, that is, the number of elements to be processed. In a processor with multiple vector registers, it is often necessary to "clear" these vector registers (store zeroes into them). This can be time consuming: for our example of a vector register with 128 elements, it would require 128 cycles to "clear" the register. It is instead proposed that the "clear" instruction be simulated and/or postponed for as long as possible.
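As a rough illustration of the cost argument above, the following Python sketch models a vector register as a list and counts one cycle per element processed. The names (VLEN, vector_add, eager_clear) and the one-cycle-per-element cost model are assumptions made for illustration; they are not taken from the disclosure.

```python
VLEN = 128  # elements per vector register, as in the example above


def vector_add(vr1, vr2):
    """Compute VR1 = VR1 + VR2 element by element; return the cycles spent."""
    cycles = 0
    for i in range(len(vr1)):
        vr1[i] += vr2[i]
        cycles += 1  # assumed cost: one cycle per element
    return cycles


def eager_clear(vr):
    """Store a zero into every element; return the cycles spent."""
    cycles = 0
    for i in range(len(vr)):
        vr[i] = 0
        cycles += 1  # assumed cost: one cycle per element
    return cycles


vr1 = list(range(VLEN))
vr2 = [1] * VLEN
add_cycles = vector_add(vr1, vr2)
clear_cycles = eager_clear(vr1)
print(add_cycles, clear_cycles)
```

Under this model both operations scale with the register length, which is the cost that the proposal aims to avoid for the "clear" case.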

In the figure, the Vector Register (VR) buffer 10 is a random-access memory configuration of 8 vector registers. Each register is 128 elements in length. When a vector clear instruction is executed, an 8-bit "mask" is passed to the control hardware. If the mask bit for a register is on, that Vector Register must be cleared. The operation, as shown in the figure, might proceed as follows. Assume the "clear" instruction being processed was to "clear" Vector Registers VR2 and VR6. The "mask" passed to the control would have bits 2 and 6 on and all other bits off. This mask configuration is sampled by the "set clear" line 12, which sets bits 2 and 6 of the Clear Register (Reg). The setting of the Clear Reg completes the execution of the "clear" instruction.
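The bookkeeping described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the disclosed hardware: the class name ClearControl, its method names, and the convention that the mask bit for VRi is 1 << i are all hypothetical. It covers only the step the text describes, namely that executing the "clear" instruction sets bits in the Clear Register without writing any of the 128 elements of a register.

```python
NUM_VRS = 8   # vector registers in the VR buffer
VLEN = 128    # elements per register


class ClearControl:
    """Hypothetical model of the VR buffer plus the Clear Register."""

    def __init__(self):
        # Arbitrary nonzero contents, so an untouched register is easy to spot.
        self.vr_buffer = [[1] * VLEN for _ in range(NUM_VRS)]
        self.clear_reg = 0  # one pending-clear bit per vector register

    def clear(self, mask):
        """Execute a "clear" instruction by latching the mask bits.

        Assumption: bits already pending stay set, so the mask is ORed in.
        No element of the VR buffer is written here.
        """
        if not 0 <= mask < (1 << NUM_VRS):
            raise ValueError("mask must fit in %d bits" % NUM_VRS)
        self.clear_reg |= mask

    def clear_pending(self, vr):
        """Return True if register number vr has its clear bit set."""
        return bool(self.clear_reg & (1 << vr))


ctl = ClearControl()
ctl.clear((1 << 2) | (1 << 6))  # "clear" VR2 and VR6
print(format(ctl.clear_reg, "08b"))
```

In this sketch the instruction costs a single mask update regardless of register length; how a pending bit is consumed when the register is later selected is left out here.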

Following this, if VR6 (or VR2) is selected for use, the "select 6" line samples the VR6 clear latch and finds it on. This signal is ORed with the selected other Clear...