
Architecture for Vector Mask Registers

IP.com Disclosure Number: IPCOM000044459D
Original Publication Date: 1984-Dec-01
Included in the Prior Art Database: 2005-Feb-05
Document File: 2 page(s) / 15K

Publishing Venue

IBM

Related People

Agerwala, TKM: AUTHOR [+2]

Abstract

The new approach set forth below provides a significant improvement in the performance of a vector processor when an operation is to be performed only if a certain vector condition is satisfied. The improvement is even greater when the computation depends on compound conditions, if-then-else, nested if-then, or select-case constructs. The simplest problem to which this approach pertains is:

    Compute A
    If A satisfies some condition, then use A to compute something else

Here, A is the vector register, and some elements satisfy the condition while others do not. Branching in scalar mode is clearly an inefficient way to code the example.

This is the abbreviated version, containing approximately 53% of the total text.



The new approach set forth below provides a significant improvement in the performance of a vector processor when an operation is to be performed only if a certain vector condition is satisfied. The improvement is even greater when the computation depends on compound conditions, if-then-else, nested if-then, or select-case constructs. The simplest problem to which this approach pertains is:

    Compute A
    If A satisfies some condition, then use A to compute something else

Here, A is the vector register, and some elements satisfy the condition while others do not. Branching in scalar mode is clearly an inefficient way to code the example.

It is assumed that the vector instruction architecture is:

    Computation: operation, mask, result, operand, operand
    Comparison:  operation, operand, operand

During computation the specified operation is performed on the two operands, and each element of the result register is updated only if the corresponding element in the mask is 1. The vector mask register associated with the result register is also updated to reflect the condition code of the result.

In the standard approach, a vector mask register (VMR) is associated with every numeric register. Each mask register consists of N elements, one for every vector element, and 5 bit slices. The bit slices have the following meanings:

    Bit slice   Description
    0           Vector bit register (VBR)
    1           Exception indicator
    2           Result less than
    3           Result greater than
    4           Result equal

Bit slices 1 through 4 are set during computations; bit slices 2 through 4, during comparisons. The vector bit register is the mask used to control the computations in the example above.
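The masked computation and the mask-register slices described above can be illustrated with a minimal Python sketch. It is not the disclosed hardware design: the names `new_mask` and `masked_compute` are hypothetical, the condition code is assumed to compare each result with zero, and masked-off elements are assumed to leave both the result and its mask bits unchanged.

```python
import operator

# Bit-slice indices of a vector mask register, as listed in the text.
VBR, EXC, LT, GT, EQ = range(5)


def new_mask(n):
    """Return an n-element mask register modeled as 5 bit slices of n bits."""
    return [[0] * n for _ in range(5)]


def masked_compute(op, mask_bits, result, result_mask, a, b):
    """Apply op elementwise; update result[i] only where mask_bits[i] == 1."""
    for i, (x, y) in enumerate(zip(a, b)):
        if not mask_bits[i]:
            continue  # assumption: masked-off elements are left untouched
        try:
            r = op(x, y)
            result_mask[EXC][i] = 0
        except ArithmeticError:
            # assumption: an arithmetic fault only sets the exception slice
            result_mask[EXC][i] = 1
            continue
        result[i] = r
        # assumption: the condition code compares the result with zero
        result_mask[LT][i] = int(r < 0)
        result_mask[GT][i] = int(r > 0)
        result_mask[EQ][i] = int(r == 0)


# "Compute A; if A satisfies some condition, use A to compute something else."
n = 4
B = [5, 1, 3, 0]
C = [2, 4, 3, 7]
A, A_mask = [0] * n, new_mask(n)
masked_compute(operator.sub, [1] * n, A, A_mask, B, C)  # A = B - C, all elements

# Condition "A >= 0": combine the greater-than and equal slices into the VBR.
A_mask[VBR] = [g | e for g, e in zip(A_mask[GT], A_mask[EQ])]

D, D_mask = [-1] * n, new_mask(n)
masked_compute(operator.add, A_mask[VBR], D, D_mask, A, A)  # D = A + A where A >= 0
```

Only the elements of D whose mask bit is 1 are overwritten; the others keep their previous value of -1. In this sketch, slice 0 of A's mask register plays the role of the vector bit register.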
The vector bit register can be updated in the following four ways:

    Load vector bit register from memory
    Load vector bit register from register
    Perform logical operation on 2 VBRs
    Combine slices using a mask and store in VBR

If a double-word data path is assumed, the first 3 operations can be performed in one machine cycle per double word, that is, in M = N/64 cycles; the last can be done in one of two ways. If each vector element in the vector mask register is treated independently, then N machine cycles are needed; if the vector mask register is treated as M double words, then the number of machine cycles needed is M times the number of selected slices. For example, if the VBR is to be set to 1 wherever the result is greater than or equal to zero, the mask would be 00011 and the result would be produced in 2M cycles.

Although the latter approach takes less time to set the VBR, there are situations in which it prevents chaining of instructions. Since chaining is one of the key factors in producing high performance, the implementors may be forced to treat the vector mask register as N elements.

The convention used is that specifying vector bit register 0 in the mask field is identical to setting all bits in the mask to 1. This convention makes it easy to update all elements of a vector, but it reduces the number of available mask registers. The...
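The cycle counts for the "combine slices" update discussed above can be checked with a small Python sketch. The function names are hypothetical. Two assumptions go beyond the text: the selected slices are combined with a logical OR (which matches the "greater than or equal to zero" example), and M is rounded up when N is not a multiple of 64.

```python
import math

DOUBLE_WORD_BITS = 64  # double-word data path assumed in the text


def double_words(n):
    """M: double words needed for one n-bit slice (rounding up is an assumption)."""
    return math.ceil(n / DOUBLE_WORD_BITS)


def combine_cycles(n, slice_mask, per_element):
    """Machine cycles needed to combine the selected slices into the VBR."""
    if per_element:
        return n  # each of the N elements is treated independently
    # mask register treated as M double words: one pass per selected slice
    return double_words(n) * slice_mask.count("1")


def combine_slices(slices, slice_mask):
    """OR together the slices selected by slice_mask (OR is an assumption)."""
    vbr = [0] * len(slices[0])
    for s, selected in enumerate(slice_mask):
        if selected == "1":
            vbr = [v | bit for v, bit in zip(vbr, slices[s])]
    return vbr


# Mask 00011 selects slices 3 (greater than) and 4 (equal): "result >= 0".
n = 128
fast = combine_cycles(n, "00011", per_element=False)  # 2M cycles, with M = 2
slow = combine_cycles(n, "00011", per_element=True)   # N cycles

# Slices in order: VBR, exception, less than, greater than, equal.
slices = [[0, 0, 0], [0, 0, 0], [0, 1, 0], [1, 0, 0], [0, 0, 1]]
vbr = combine_slices(slices, "00011")
```

For N = 128 the double-word treatment needs 4 cycles against 128 for the per-element treatment, which is the trade-off the text weighs against the loss of chaining.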