Instruction Cache Block Touch Retro-Fitted onto Microprocessor

IP.com Disclosure Number: IPCOM000115873D
Original Publication Date: 1995-Jul-01
Included in the Prior Art Database: 2005-Mar-30
Document File: 4 page(s) / 214K

Publishing Venue

IBM

Related People

Funk, MR: AUTHOR [+2]

Abstract

An Instruction Cache Block Touch instruction can be added to the microprocessor architecture through the use of redundant encodings in existing branch instructions. Prudent compiler use of this instruction allows a subsequently executed instruction stream to be prefetched into the cache in parallel with the execution of the instructions that lead up to it.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 26% of the total text.

      Modern high-performance processors support at least one level
of instruction cache.  Instructions are dispatched from the lowest
level of instruction cache, often at rates of multiple instructions
per cycle.  When subsequent instructions are not found in this cache,
there is a delay while a block of instructions is loaded from higher
levels of cache or, failing that, directly from main storage.  The
delay can be as little as a few cycles, as would be the case for a
hit in a higher level of cache, or tens to hundreds of cycles if the
block of instructions must be accessed from main storage.  However,
with the faster cycle times of future processors, even loading the
lowest level of the cache will take tens of cycles.
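
      The delay tiers described above can be combined into an average
penalty per miss.  The following Python sketch is illustrative only:
the function name, the hit probability, and the cycle counts are
assumptions chosen to match the orders of magnitude in the text, not
figures for any particular processor.

```python
def average_miss_penalty(p_hit_higher_cache, higher_cache_cycles, main_storage_cycles):
    """Expected delay, in cycles, for one miss in the lowest-level cache.

    p_hit_higher_cache: probability the block is found in a higher cache level
    higher_cache_cycles: assumed delay when a higher cache level supplies the block
    main_storage_cycles: assumed delay when main storage supplies the block
    """
    if not 0.0 <= p_hit_higher_cache <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return (p_hit_higher_cache * higher_cache_cycles
            + (1.0 - p_hit_higher_cache) * main_storage_cycles)


# Hypothetical numbers: 75% of misses are satisfied by a higher cache
# level in 4 cycles; the remainder go to main storage at 100 cycles.
penalty = average_miss_penalty(0.75, 4, 100)
print(penalty)
```

      Even with most misses satisfied by a higher cache level, the
rare main storage accesses dominate the average under these assumed
numbers.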

      Instruction cache miss delays can account for a significant
percentage of the processor cycles while the processor is doing work.
In some transaction-based environments, this delay is estimated to
account for about 30% of the processing time.
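
      To see how a figure of this size can arise, the share of cycles
lost to instruction cache misses can be estimated from a simple
cycles-per-instruction model.  The sketch below is a rough
illustration; the workload parameters are hypothetical and were
picked only to land near the 30% estimate quoted above.

```python
def icache_stall_fraction(base_cpi, misses_per_instruction, miss_penalty_cycles):
    """Fraction of all cycles spent waiting on instruction cache misses.

    base_cpi: cycles per instruction with a perfect instruction cache
    misses_per_instruction: instruction cache misses per executed instruction
    miss_penalty_cycles: average delay per miss, in cycles
    """
    stall_cpi = misses_per_instruction * miss_penalty_cycles
    return stall_cpi / (base_cpi + stall_cpi)


# Hypothetical workload: one miss per 100 instructions, a 43-cycle
# average penalty, and one cycle per instruction when nothing misses.
fraction = icache_stall_fraction(base_cpi=1.0,
                                 misses_per_instruction=0.01,
                                 miss_penalty_cycles=43)
print(f"{fraction:.0%} of cycles are instruction-fetch stalls")
```

      Under this model a miss rate of only 1% is enough to consume
roughly 30% of the cycles when the average penalty is a few tens of
cycles.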

      Some processors attempt to minimize this problem by
speculatively accessing subsequent cache lines.  When the speculation
is correct, this saves a few of the delay cycles.  When it is not
correct, the additional storage accesses can degrade performance,
especially in a multiprocessor environment.
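
      The trade-off in speculative next-line access can be shown with
a toy simulation.  This is a minimal sketch under stated assumptions:
the cache never evicts, every access triggers a speculative fetch of
the next sequential block, and the function name and traces are
hypothetical.

```python
def simulate_next_line_prefetch(block_trace):
    """Count demand misses and prefetch outcomes for next-line speculation.

    block_trace: sequence of cache block numbers in execution order.
    The cache is assumed unbounded (no evictions) to keep the sketch short.
    Returns (demand_misses, useful_prefetches, wasted_prefetches).
    """
    cached = set()       # blocks present in the instruction cache
    speculative = set()  # blocks fetched speculatively and not yet executed
    demand_misses = 0
    useful_prefetches = 0

    for block in block_trace:
        if block in cached:
            if block in speculative:
                # The earlier speculation paid off.
                speculative.discard(block)
                useful_prefetches += 1
        else:
            demand_misses += 1
            cached.add(block)
        # Speculate that the sequentially next block will be needed.
        next_block = block + 1
        if next_block not in cached:
            cached.add(next_block)
            speculative.add(next_block)

    # Anything still marked speculative was fetched but never executed.
    wasted_prefetches = len(speculative)
    return demand_misses, useful_prefetches, wasted_prefetches


straight_line = simulate_next_line_prefetch([0, 1, 2, 3])
branchy = simulate_next_line_prefetch([0, 10, 20, 30])
print(straight_line, branchy)
```

      On the straight-line trace nearly every speculative access is
used; on the branchy trace every one is wasted, and each wasted access
is extra storage traffic of the kind the text warns about.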

      If software could inform the processor hardware of the likely
direction that subsequent instruction processing will take, and do so
well ahead of the time when the instructions are actually needed, a
significant portion of this performance degradation could be avoided.
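
      The benefit of an early software hint can be sketched with a
simple timing model.  The model below is an assumption made for
illustration, not the disclosed hardware design: each program entry
costs one cycle, a "touch" entry starts a block load without waiting
for it, loads may overlap, and the names used are hypothetical.

```python
def run(program, miss_latency):
    """Return total cycles to run `program` under a simple timing model.

    program: list of ("exec", block) or ("touch", block) tuples.
    Each entry costs one cycle; a block is usable once its load completes.
    """
    ready_at = {}  # block -> cycle at which it becomes available in the cache
    cycle = 0
    for op, block in program:
        if op == "touch":
            # Hint only: start the load and keep executing.
            ready_at.setdefault(block, cycle + miss_latency)
        elif op == "exec":
            if block not in ready_at:
                # Demand miss: the load starts only now.
                ready_at[block] = cycle + miss_latency
            # Stall until the block has arrived.
            cycle = max(cycle, ready_at[block])
        else:
            raise ValueError(f"unknown operation: {op}")
        cycle += 1
    return cycle


# Twenty instructions from block "A" lead up to five from block "B".
without_hint = [("exec", "A")] * 20 + [("exec", "B")] * 5
with_hint = [("touch", "B")] + without_hint

print(run(without_hint, miss_latency=10), run(with_hint, miss_latency=10))
```

      With the hint issued well ahead of need, the load of block "B"
completes while block "A" is still executing, so the only added cost
in this model is the cycle spent on the hint itself.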

      It is also desirable that compilers be able to generate code
for a number of processor design points, but Reduced Instruction Set
Computer (RISC) architectures typically do not allow additional
instructions to be retrofitted onto existing design points.

      Software informs hardware of the likely direction that
instruction processing will take via a new instruction to be called
Instruction Cache Block Touch (icbt).  This instruction should have
the following attributes:
  o  icbt should have the effect of initiating the preloading of the
      instruction cache.
  o  icbt should have an efficient means of specifying the address of
      the location which should be preloaded into the instruction
      cache.  That is, it should not take multiple instructions to
      access the location and it should use an existing add...