Method for reducing communications in a clustered processor by means of occupancy-aware steering

IP.com Disclosure Number: IPCOM000127017D
Publication Date: 2005-Aug-17
Document File: 4 page(s) / 71K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for reducing communications in a clustered processor by means of occupancy-aware steering. Benefits include improved functionality and improved performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 39% of the total text.

Background

      Clustered microarchitectures are a key paradigm for next-generation processors because they effectively address challenges such as wire delays, power density, and temperature distribution. A major drawback of these architectures is the cost of communicating values from one cluster to another.

      A key factor in the performance of clustered processors is the latency of communicating register values among clusters. The steering engine, placed in the dispatch stage, determines the destination cluster of each instruction. Typically, this engine attempts to minimize communications by steering an instruction to the cluster that holds most of its inputs. In one approach, if the instruction cannot be steered to the most preferred cluster, the dispatch of instructions is stalled; this causes a ~30% performance degradation compared to a method that probes the rest of the clusters. Alternatively, instructions can be steered to another cluster when the optimum cluster cannot receive any more instructions that cycle. In that case, the number of communications increases because non-optimal clusters may not hold any of the inputs.
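      The two conventional policies described above can be sketched as follows. This is a minimal illustration, not code from the disclosure: the function names, the per-cluster sets of valid logical registers, and the free-slot counters are all assumptions made for the example.

```python
from typing import Optional, Sequence, Set


def comms_needed(sources: Sequence[str], valid_regs: Set[str]) -> int:
    """Count source registers that would have to be copied into a cluster."""
    return len(set(sources) - valid_regs)


def steer_baseline(
    sources: Sequence[str],
    valid: Sequence[Set[str]],   # valid[c]: logical registers cluster c holds (assumed model)
    free_slots: Sequence[int],   # free_slots[c]: dispatch slots left in cluster c this cycle
    probe_others: bool,
) -> Optional[int]:
    """Return the chosen cluster index, or None when dispatch must stall."""
    # Rank clusters by how few communications they need (ties keep index order).
    order = sorted(range(len(valid)), key=lambda c: comms_needed(sources, valid[c]))
    preferred = order[0]
    if free_slots[preferred] > 0:
        return preferred
    if not probe_others:
        return None  # stall-only policy: wait for the preferred cluster
    # Probing policy: take the next-best cluster that can accept the instruction.
    for c in order[1:]:
        if free_slots[c] > 0:
            return c
    return None
```

      With valid = [{"r1", "r2"}, {"r3"}] and a full cluster 0, the stall-only policy returns None for an instruction reading r1 and r2, while the probing policy returns cluster 1 at the cost of two extra communications.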

General description

              The disclosed method includes a steering technique that reduces intercluster register communications. When the backend is almost full, the eligible clusters are limited to those that minimize communications. As a result, communication is reduced and performance is improved.

              The key elements of the method include:

•             Mechanism to determine the number of communications required when the instruction is steered to a particular cluster

•             Mechanism to identify the load of the different queues and schedulers in the backends

•             Clustered backends with schedulers and execution units

•             Front-end with mechanism to identify if a cluster has a valid copy of each logical register

•             Independent renaming of input instruction registers for each cluster, using the Register Rename Table

•             Special COPY instruction for distributing register values among clusters

•             Sorting of destination clusters for an instruction according to the steering criteria and stalling of the dispatch stage only when the backends are full and the destination cluster is not the preferred one
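              One possible reading of how these elements combine is sketched below. It is an illustration under stated assumptions, not the disclosed implementation: the abbreviated text does not define "almost full", so the 0.75 threshold, the uniform per-cluster capacity, the occupancy tie-break, and all names are hypothetical.

```python
from typing import Optional, Sequence, Set


def steer_occupancy_aware(
    sources: Sequence[str],
    valid: Sequence[Set[str]],   # valid[c]: logical registers with a valid copy in cluster c
    occupancy: Sequence[int],    # occupancy[c]: entries in use in cluster c's scheduler
    capacity: int,               # scheduler entries per cluster (assumed equal for all)
    almost_full: float = 0.75,   # hypothetical threshold; not given in the source text
) -> Optional[int]:
    """Return the destination cluster, or None when dispatch must stall."""
    n = len(valid)
    # Communications required if the instruction is steered to each cluster.
    cost = [len(set(sources) - valid[c]) for c in range(n)]
    # Sort clusters by the steering criterion; breaking ties by load is an assumption.
    order = sorted(range(n), key=lambda c: (cost[c], occupancy[c]))
    backend_load = sum(occupancy) / (capacity * n)
    if backend_load >= almost_full:
        # Backend almost full: only clusters that minimize communications stay eligible.
        best = cost[order[0]]
        order = [c for c in order if cost[c] == best]
    for c in order:
        if occupancy[c] < capacity:
            return c
    return None  # no eligible cluster can accept the instruction: stall
```

              Under this reading, a lightly loaded backend still lets an instruction fall through to a cluster that requires COPY instructions, whereas a nearly full backend stalls instead, which is what keeps the extra communications from piling up.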

Advantages

              The disclosed method provides advantages, including:

•             Improved functionality due to providing the Register Rename Table

•             Improved functionality due to providing the COPY instruction for distributing register values among clusters

•             Improved performance due to reducing the total n...