Browse Prior Art Database

Automatic Program Reordering for Data References in Unified Cache

IP.com Disclosure Number: IPCOM000117606D
Original Publication Date: 1996-Apr-01
Included in the Prior Art Database: 2005-Mar-31
Document File: 2 page(s) / 103K

Publishing Venue

IBM

Related People

Heisch, RR: AUTHOR

Abstract

The technique of profile-based code repositioning or Feedback Directed Program Restructuring (FDPR) is a well known method for reorganizing the instructions in a program, based on an address trace or execution profile of the actual behavior of that program, so as to improve the utilization of the instruction cache (which typically results in improved performance) as well as the utilization of real memory (which typically results in a reduction in real memory requirements). The FDPR product for AIX* has provided speedups of up to 73% (typically in the 10-20% range) and text memory usage reductions of up to 61% (typically in the 20-30% range) for instruction reordering.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

Automatic Program Reordering for Data References in Unified Cache

      The technique of profile-based code repositioning or Feedback
Directed Program Restructuring (FDPR) is a well known method for
reorganizing the instructions in a program, based on an address trace
or execution profile of the actual behavior of that program, so as to
improve the utilization of the instruction cache (which typically
results in improved performance) as well as the utilization of real
memory (which typically results in a reduction in real memory
requirements).  The FDPR product for AIX* has provided speedups of up
to 73% (typically in the 10-20% range) and text memory usage
reductions of up to 61% (typically in the 20-30% range) for
instruction reordering.

      Profile-based data reordering may offer similar improvements,
particularly for a unified set-associative cache, although less work
has been done in this area (a more difficult problem especially when
attempted at the executable level).  No known postprocessing tools
attempt data reordering due to the difficulty in updating all
references to moved data.

      This disclosure presents a method and algorithm for modifying
the FDPR product which effectively achieves data reordering without
moving any data and further optimizes performance in a unified,
set-associative cache environment (such as the PowerPC* 601)
utilizing execution profile feedback.

      The idea is to collect not only runtime instruction reference
patterns (as FDPR does today) but to also include data reference
patterns for a selected workload(s) and to then use that profile data
to logically partition for a unified cache and assign instruction and
data object addresses accordingly so as to eliminate or minimize
congruence class collisions (conflict misses) which occur between
instruction fetches and data references.

      The FDPR tool would then automatically produce a reordered
executable with instructions placed in congruence classes which
minimize conflict misses with data references for a specific unified
cache organization.

      Application of this method at the executable level can be
achieved by maintaining existing data object locations (i.e., data
reordering is not required), partitioning the cache based on "hot"
data reference patterns, and then assigning memory addresses to
non-conflicting cache congruence classes for associated instruction
fetches (in other words, for a given "hot" execution sequence,
determine the "used" locations in the cache for hot data references
and then automatically assign the remaining or less used cache
locations to instructions).  FDPR would then produce a new,
instruction reordered version of the executable with instruction/data
reference conflicts minimized (or possibly eliminated).  The result
would be improved program performance due to improved unified cache
...