Using Instruction Decode Cost and Instruction Scheduling on Intel 80386

IP.com Disclosure Number: IPCOM000116523D
Original Publication Date: 1995-Sep-01
Included in the Prior Art Database: 2005-Mar-30
Document File: 2 page(s) / 41K

Publishing Venue

IBM

Related People

Stoodley, KA: AUTHOR

Abstract

A method for improving performance on Intel* 80386DX* and 80386SX* microprocessors is disclosed. Instruction decode complexity is used as a cost metric in the heuristics employed by the instruction scheduling phase of a compiler, in order to reduce the cost of taken branches.

This is the abbreviated version, containing approximately 81% of the total text.

      The execution cost of a branch instruction on the Intel 80386
family of microprocessors is the sum of two components.  The fixed
component is determined by the exact type of processor and the type
of branch being executed; for a given branch instruction it cannot be
affected by code generation choices.  For example, conditional
branches on the 80386DX microprocessor have a fixed component of 7
clock cycles.  The variable component is determined by the number of
decode elements in the target instruction: every decode element in
the target instruction adds one clock to the execution cost of the
branch instruction.
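
      The cost model described above can be written as a small
calculation.  The following Python sketch is illustrative only: the
function name and the decode element counts in the example are
assumptions made for this illustration, and only the 7-clock fixed
component for conditional branches on the 80386DX comes from the
text.

```python
# Fixed component quoted in the text for conditional branches on the 80386DX.
FIXED_COND_BRANCH_CLOCKS_386DX = 7


def taken_branch_cost(fixed_clocks, target_decode_elements):
    """Return the clocks for a branch: the fixed component plus one
    clock per decode element in the target instruction."""
    if fixed_clocks < 0 or target_decode_elements < 0:
        raise ValueError("clock and decode element counts must be non-negative")
    return fixed_clocks + target_decode_elements


# Hypothetical targets: one with 1 decode element, one with 3.
cheap_target = taken_branch_cost(FIXED_COND_BRANCH_CLOCKS_386DX, 1)   # 8 clocks
costly_target = taken_branch_cost(FIXED_COND_BRANCH_CLOCKS_386DX, 3)  # 10 clocks
```

      In this example, choosing the target with fewer decode elements
saves 2 clocks each time the branch is taken, while the fixed
component stays the same.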

      The instruction scheduling phase of a compiler can be extended
to calculate the number of decode elements in every instruction.
With that support in place, when the scheduler considers which of
the available ready instructions to schedule after a label (such
instructions would become the targets of branch instructions), it
weights instructions with fewer decode elements more favorably than
those with more decode elements.  As a result the
branch execution times are re...
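
      The weighting described above might be sketched as follows.
This is a minimal Python illustration under stated assumptions, not
the disclosed implementation: the Instr record, the priority field
(a stand-in for whatever other heuristics the scheduler uses), the
decode_weight parameter, and the decode element counts in the
example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    text: str
    decode_elements: int  # assumed to be computed elsewhere by the compiler
    priority: int         # hypothetical score from other heuristics (higher is better)


def score(instr, after_label, decode_weight=1):
    """Score a ready instruction; decode elements are penalized only
    when the slot being filled directly follows a label."""
    penalty = decode_weight * instr.decode_elements if after_label else 0
    return instr.priority - penalty


def pick_next(ready, after_label, decode_weight=1):
    """Pick the highest-scoring instruction from the ready list."""
    if not ready:
        raise ValueError("ready list is empty")
    return max(ready, key=lambda i: score(i, after_label, decode_weight))


# Made-up ready list; the decode element counts are illustrative only.
ready = [
    Instr("mov eax, [ebx+ecx*4+8]", decode_elements=3, priority=5),
    Instr("inc edx", decode_elements=1, priority=4),
]

first_after_label = pick_next(ready, after_label=True)   # the "inc edx" entry
first_elsewhere = pick_next(ready, after_label=False)    # the "mov" entry
```

      Folding the decode element count into a single score, rather
than treating it as an absolute rule, keeps it as one weight among
the scheduler's other heuristics, which matches the "weighted more
favorably" wording of the text.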