Static Architectural Loop Timer Methodology

IP.com Disclosure Number: IPCOM000117759D
Original Publication Date: 1996-May-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 2 page(s) / 70K

Publishing Venue

IBM

Related People

Bose, P: AUTHOR

Abstract

Disclosed is a new, table-driven Cycles-Per-Instruction (CPI) estimation methodology for use during early-stage microarchitecture design point definition and analysis of superscalar processors. A set of algorithms that implement the methodology for static loop kernels is briefly described, and the overall program implementation is sketched.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 53% of the total text.

      Disclosed is a new, table-driven Cycles-Per-Instruction (CPI)
estimation methodology for use during early-stage microarchitecture
design point definition and analysis of superscalar processors.  A
set of algorithms that implement the methodology for static loop
kernels is briefly described, and the overall program implementation
is sketched.

There are three main facets of this invention:
  1.  The idea of modeling the instruction set architecture at a
       reduced, higher-level abstraction derived systematically from
       the actual (full) architecture specification and the
       early-stage execution semantics specification.
  2.  The formula-based (analytical, table look-up) method for
       estimating execution cost and steady-state CPI from a linear
       macro code sequence with a single loop-ending branch.
  3.  The ability to deal with a parametrized hardware model giving
       rise to a "programmable" estimation methodology.
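
      As a rough illustration of facets 2 and 3, the sketch below
shows what a table-driven, parametrized estimate could look like.  It
is not the disclosed algorithm.  The HardwareModel class, its
issue_width and units fields, the opcode class names, and the loop
body are all hypothetical.  The cost formula is a standard resource
bound for a dependence-free loop body on fully pipelined units,
swapped in here only to show how changing the hardware parameters
changes the estimate.  The one thing taken as given is the definition
of CPI as cycles divided by instructions.

```python
from collections import Counter
from dataclasses import dataclass
from math import ceil


@dataclass
class HardwareModel:
    """Hypothetical parametrized machine description (all names are assumptions)."""
    issue_width: int  # d: maximum instructions issued per cycle
    units: dict       # table: opcode class -> number of fully pipelined units


def cycles_per_iteration(loop, hw):
    """Resource-bound cycle estimate for one iteration of a dependence-free loop.

    Takes the larger of the issue-limited bound ceil(L / d) and the
    busiest functional-unit class, ceil(uses / units), read from the table.
    Data dependences are ignored, so this is an optimistic bound.
    """
    issue_bound = ceil(len(loop) / hw.issue_width)
    counts = Counter(loop)
    unit_bound = max(ceil(n / hw.units[op]) for op, n in counts.items())
    return max(issue_bound, unit_bound)


def steady_state_cpi(loop, hw):
    """CPI = cycles per iteration / instructions per iteration."""
    return cycles_per_iteration(loop, hw) / len(loop)


# Hypothetical loop body, given as a sequence of opcode classes
# ending in the single loop-closing branch.
loop = ["load", "load", "fp", "fp", "store", "branch"]

# Two hypothetical design points: only the table entries change.
narrow = HardwareModel(issue_width=2,
                       units={"load": 1, "fp": 1, "store": 1, "branch": 1})
wide = HardwareModel(issue_width=4,
                     units={"load": 2, "fp": 2, "store": 1, "branch": 1})

cpi_narrow = steady_state_cpi(loop, narrow)  # 3 cycles / 6 instructions = 0.5
cpi_wide = steady_state_cpi(loop, wide)      # 2 cycles / 6 instructions
```

      Because the machine is described entirely by the table, a new
design point is evaluated by editing parameters rather than code,
which is the sense in which such an estimator is "programmable".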

      The overall method and process of evaluating the execution time
bound T for a given (mapped) n-iteration loop trace is summarized as
follows:
  1.  Calculate the data dependence initiation (penalty) matrix
       P(i, j), for each ordered pair of instructions, (i, j),
       for i+1 <= j <= i+d-1, where d is the issue width of the
       superscalar processor.
  2.  Calculate the dependence-free issue-order count (IC), for each
       instruction in the given loop trace, using procedure
       ic_compute (see below), invoking it as ic_compute(1, L),
       where L is the length of the loop trace.
  3.  Invoke procedure ic_update (see below) to add
       dependence-removing null operat...