
Method and System for Skipping Inactive Predicated Instructions in a Vector Processor
Disclosure Number: IPCOM000239504D
Publication Date: 2014-Nov-12
Document File: 2 page(s) / 89K

Publishing Venue

The Prior Art Database


A method and system is disclosed for skipping inactive predicated instructions in a vector processor.

This is the abbreviated version, containing approximately 52% of the total text.



In a vector processor, memory accesses and arithmetic instructions can act upon full vectors or generally continuous portions of vectors, where vectors enable parallel processing of multiple data elements. For some operations, the memory accesses or calculations for randomly distributed elements of the vectors may be immaterial. To reduce the traffic through the memory hierarchy, and the associated energy, caused by unnecessary memory accesses, vector processors include vector mask register files for use in predication of instructions and as targets of comparison instructions. Predication allows selective execution of certain elements of the vectors based on a mask. However, predication by itself does not provide any performance gain, because inactive predicated elements still take cycles. In the worst case, when the mask has no active elements, no operation is performed while a number of cycles equal to the vector length is consumed.
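
The cost described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the disclosed hardware: the function name `predicated_add`, the one-element-per-cycle cost model, and the choice to leave inactive destination elements unchanged are all hypothetical and chosen for illustration.

```python
def predicated_add(a, b, dst, mask):
    """Model a predicated vector add that spends one cycle per element slot.

    Assumptions (not from the disclosure): one element is processed per
    cycle, and inactive elements keep their previous destination value.
    """
    out = list(dst)
    cycles = 0
    for i, active in enumerate(mask):
        cycles += 1  # the slot is consumed whether or not the element is active
        if active:
            out[i] = a[i] + b[i]
    return out, cycles


a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
dst = [0] * 8
mask = [0, 1, 0, 0, 1, 0, 0, 0]

result, cycles = predicated_add(a, b, dst, mask)
useful = sum(mask)  # number of elements that actually needed work
print(result, cycles, useful)
```

In this model only two of the eight element slots do useful work, yet the cycle count still equals the vector length; with an all-zero mask the same number of cycles is spent and nothing is computed.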

Disclosed is a method and system for skipping inactive predicated instructions in the vector processor.

In an embodiment, the method and system utilize multiple slices, and chaining between them, in a VLIW architecture. Slices are functional units that often operate in lockstep. First, the method and system calculate the number of 1's in a mask vector. This is the total number of operations that are to be performed, namely the 'iteration count', which is set in an 'iteration count register' to indicate that the following vector operation repeats only that many times. The method and system then start calculating the "position of the leading 1" with clear. Thereafter, the method and system feed the result obtained to other slices, preferably through chaining. The...
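
The two steps named above, counting the 1's in the mask and repeatedly finding the "position of the leading 1" with clear, can be sketched in software as follows. This is a minimal sketch under assumptions, not the disclosed implementation: it assumes that "with clear" means the located bit is cleared so the next search finds the next active element, and that the "leading" 1 is the active element with the lowest index. All function names are hypothetical, and slices, chaining, and the iteration count register are modelled only as ordinary Python values.

```python
def iteration_count(mask):
    """Number of 1's in the mask: how many element operations are needed."""
    return sum(1 for bit in mask if bit)


def leading_one_with_clear(mask):
    """Return (position of the first 1, copy of the mask with that bit cleared).

    Returns (None, unchanged copy) if the mask has no 1's.
    Assumption: 'leading' means the lowest element index.
    """
    for i, bit in enumerate(mask):
        if bit:
            cleared = list(mask)
            cleared[i] = 0
            return i, cleared
    return None, list(mask)


def active_positions(mask):
    """Positions that would be fed to the other slices, one per iteration."""
    positions = []
    remaining = list(mask)
    for _ in range(iteration_count(mask)):
        pos, remaining = leading_one_with_clear(remaining)
        positions.append(pos)
    return positions


def skipping_add(a, b, dst, mask):
    """Vector add that performs work only for active elements."""
    out = list(dst)
    steps = 0
    for pos in active_positions(mask):
        out[pos] = a[pos] + b[pos]
        steps += 1
    return out, steps


mask = [0, 1, 0, 0, 1, 0, 0, 0]
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
positions = active_positions(mask)
result, steps = skipping_add(a, b, [0] * 8, mask)
print(positions, result, steps)
```

Under these assumptions the number of element steps equals the iteration count rather than the vector length, so an all-zero mask results in no element steps at all.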