Method for power reduction for CPUs through compiler-directed scheduling for asymmetrical functional units

IP.com Disclosure Number: IPCOM000006445D
Publication Date: 2002-Jan-03
Document File: 3 page(s) / 39K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for power reduction for CPUs through compiler-directed scheduling for asymmetrical functional units. Benefits include improved processing performance.

This is the abbreviated version, containing approximately 50% of the total text.

Background

              Several trends are the key driving forces behind CPU design:

·        The amount of available instruction-level parallelism (ILP) is limited in most applications, including integer applications and transaction-processing-based server workloads. At the same time, shrinking feature sizes make a large amount of silicon area available. The computer architecture challenge for 0.1-μm technology and beyond is to find the most effective way to use this real estate within the constraints of limited ILP.

·        Power consumption is the biggest performance limiter for future CPUs, not silicon surface area. Every possible method for power reduction is required for the success of these designs. The interface between the architecture and the compiler is the biggest untapped source of power reduction.

·        CPUs increasingly rely on compilers to provide the most effective utilization of the available machine resources. This situation is especially true for Explicitly Parallel Instruction Computing (EPIC) architectures. The compiler provides the best leverage for the most performance and power-effective utilization of CPU resources.

              Conventional solutions implement multiple instances of a functional unit type, such as ALUs, using the same circuit style so that all instances have equal latencies. Because ILP in applications is limited, the ALUs cannot all be fully utilized. Power is therefore wasted in aggressive designs, such as dual-rail domino circuits used to achieve single-cycle latency in integer ALUs. The alternative of implementing all ALUs at lower power and longer delay is also not acceptable because it slows down all applications. The disclosed method overcomes these problems by providing a choice of delay/power points across multiple ALUs and enabling the compiler to produce the schedule with the best power/performance trade-off.

Description

              The disclosed method saves power within CPUs through the aid of an architectural enhancement and associated compiler techniques. The method includes several key components:

·        Asymmetrical functional units

·        Instruction encoding hooks

·        Changes to the CPU front-end

·        Changes to the scheduler module of the compiler
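The scheduler component can be illustrated with a minimal sketch. It is not the disclosed algorithm: it assumes one fast, high-energy ALU type and one slow, low-energy ALU type, ignores limits on the number of units, and uses made-up latency and energy figures. All names (FAST, SLOW, assign_units, and so on) are hypothetical. The sketch moves an operation to the slow unit only when doing so does not lengthen the dependence-limited schedule.

```python
# Hypothetical unit parameters; the disclosure gives no concrete figures.
FAST = ("fast", 1, 4.0)  # (name, latency in cycles, relative energy per op)
SLOW = ("slow", 2, 1.0)


def schedule_length(deps, latency):
    """Return the length in cycles of the longest dependence chain.

    deps maps each operation to the operations it depends on and must be
    listed in topological order. Unit counts are assumed unlimited.
    """
    finish = {}
    for op, preds in deps.items():
        start = max((finish[p] for p in preds), default=0)
        finish[op] = start + latency[op]
    return max(finish.values(), default=0)


def assign_units(deps):
    """Greedily move ops to the slow unit while the schedule length holds."""
    latency = {op: FAST[1] for op in deps}
    baseline = schedule_length(deps, latency)
    choice = {op: FAST[0] for op in deps}
    for op in deps:
        latency[op] = SLOW[1]  # tentatively demote this op
        if schedule_length(deps, latency) <= baseline:
            choice[op] = SLOW[0]  # the op had enough slack
        else:
            latency[op] = FAST[1]  # on the critical path: keep it fast
    return choice, baseline


def total_energy(choice):
    """Sum the relative energy of all ops under the chosen unit types."""
    energy = {FAST[0]: FAST[2], SLOW[0]: SLOW[2]}
    return sum(energy[unit] for unit in choice.values())


# Example: a -> b -> c -> e is the critical chain; d feeds e and has slack.
deps = {"a": [], "b": ["a"], "c": ["b"], "d": [], "e": ["c", "d"]}
choice, baseline = assign_units(deps)
print(choice, baseline, total_energy(choice))
```

In this example only the operation off the critical chain is moved to the slow unit, so the schedule length is unchanged while the summed relative energy drops. A real scheduler would also have to respect the number of units of each type and convey the choice to the hardware through the instruction encoding.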

              Modern CPUs have multiple instances of specific types of functional units. For example,...