
Superscalar microprocessor implementation having merged scalar and multimedia datapath

IP.com Disclosure Number: IPCOM000011582D
Original Publication Date: 2003-Mar-06
Included in the Prior Art Database: 2003-Mar-06
Document File: 1 page(s) / 40K

Publishing Venue

IBM

Abstract

According to the present invention, a processor can issue both scalar and SIMD instructions to a datapath. The processor contains an instruction issue stage that detects when a plurality of compatible instructions in the issue queue are ready and can be processed simultaneously within the lanes of a SIMD datapath. Such instructions are issued simultaneously and proceed through the pipeline as a single packet. In one embodiment, the instructions execute the same operation in all lanes. In another embodiment, the instructions come from a set that shares common characteristics, such as latency, the number of inputs, and so forth, which make them amenable to simultaneous execution.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 95% of the total text.


THIS COPY WAS MADE FROM AN INTERNAL IBM DOCUMENT AND NOT FROM THE PUBLISHED BOOK

YOR820010997 Louis J Percello/Watson/IBM Michael Gschwind, Valentina Salapura

  Superscalar microprocessor implementation having merged scalar and multimedia datapath

According to the present invention, a processor can issue both scalar and SIMD instructions to a datapath. The processor contains an instruction issue stage that detects when a plurality of compatible scalar instructions in the issue queue are ready and can be processed simultaneously within the lanes of a SIMD datapath.
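The detection step performed by the issue stage can be illustrated with a minimal sketch. It is not the disclosed hardware design: the lane count, the `Instr` record, and the policy of letting the oldest ready instruction choose the operation are all assumptions made for illustration, and "compatible" is simplified here to "same operation".

```python
from dataclasses import dataclass
from typing import List

LANES = 4  # assumed SIMD width; the disclosure does not specify one


@dataclass
class Instr:
    seq: int     # position in program order (hypothetical field)
    op: str      # scalar operation name, e.g. "add"
    ready: bool  # True when all source operands are available


def select_bundle(issue_queue: List[Instr], lanes: int = LANES) -> List[Instr]:
    """Pick up to `lanes` ready instructions that share one operation.

    Assumed policy: the oldest ready instruction decides which operation
    the bundle carries; younger ready instructions with the same
    operation fill the remaining lanes.
    """
    ready = sorted((i for i in issue_queue if i.ready), key=lambda i: i.seq)
    if not ready:
        return []
    op = ready[0].op
    return [i for i in ready if i.op == op][:lanes]


queue = [
    Instr(0, "add", True),
    Instr(1, "mul", True),
    Instr(2, "add", True),
    Instr(3, "add", False),  # operands not yet available, so not bundled
    Instr(4, "add", True),
]
bundle = select_bundle(queue)
```

In this sketch the three ready `add` instructions would travel down the pipeline together as one packet, while the `mul` and the not-yet-ready `add` stay in the queue for a later cycle.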

According to the present invention, the scalar instructions are issued simultaneously and proceed through the pipeline as a single packet. When an exception occurs in any one of the bundled instructions, all instructions in the bundle are rejected, and execution is performed in accordance with prior art, i.e., each instruction is executed separately. In an optimized embodiment, only some of the instructions in a bundle are rejected rather than all of them: a reject mask is generated based upon information about the first instruction, in program order, that raises a synchronous exception within the bundle.
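The two rejection policies can be sketched as follows. The text says only that the mask is derived from the first faulting instruction in program order; this sketch assumes, beyond the source, that instructions older than it complete and that it and all younger instructions are rejected. The function names and the list-based lane representation are hypothetical.

```python
from typing import List


def reject_all(seqs: List[int], faulted: List[bool]) -> List[bool]:
    """Baseline policy: any exception rejects every instruction in the bundle."""
    return [any(faulted)] * len(seqs)


def reject_mask(seqs: List[int], faulted: List[bool]) -> List[bool]:
    """Optimized policy (assumed interpretation): reject the first faulting
    instruction in program order and everything younger than it.

    seqs[k] is the program-order position of the instruction in lane k,
    faulted[k] is True if that lane raised a synchronous exception.
    Returns one flag per lane, True meaning "rejected".
    """
    faulting = [s for s, f in zip(seqs, faulted) if f]
    if not faulting:
        return [False] * len(seqs)
    first = min(faulting)
    return [s >= first for s in seqs]


# Lanes hold instructions 10, 12, 15 and 17; instructions 12 and 17 fault.
mask = reject_mask([10, 12, 15, 17], [False, True, False, True])
```

Under this assumption, instruction 10 would complete as part of the bundle, and instructions 12, 15, and 17 would be rejected and re-executed separately.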

In one embodiment, the instructions execute the same operation in all lanes. In another embodiment, the instructions come from a set that shares common characteristics, such as latency, the number of inputs, particular processing resources, and so forth, which make them amenable to s...
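The difference between the two embodiments can be sketched as a compatibility test. The table below is hypothetical: the operations, latencies, and resource names are illustrative values, not taken from the disclosure, which names only the kinds of characteristics that might be compared.

```python
from typing import Dict, NamedTuple


class OpClass(NamedTuple):
    latency: int      # pipeline latency in cycles (illustrative values)
    num_inputs: int   # number of source operands
    resource: str     # processing resource the operation needs


# Hypothetical characteristics table; none of these values come from the source.
OP_CLASSES: Dict[str, OpClass] = {
    "add": OpClass(1, 2, "alu"),
    "sub": OpClass(1, 2, "alu"),
    "and": OpClass(1, 2, "alu"),
    "mul": OpClass(3, 2, "multiplier"),
}


def compatible(op_a: str, op_b: str, same_op_only: bool = False) -> bool:
    """Decide whether two scalar operations may share a bundle.

    same_op_only=True models the first embodiment (identical operation in
    every lane); the default models the second (matching characteristics).
    """
    if same_op_only:
        return op_a == op_b
    return OP_CLASSES[op_a] == OP_CLASSES[op_b]
```

With these illustrative values, `add` and `sub` could share a bundle under the second embodiment but not under the first, and `add` and `mul` could not share one under either.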