
Pre-emptive Cache Miss Processing

IP.com Disclosure Number: IPCOM000121894D
Original Publication Date: 1991-Oct-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 2 page(s) / 159K

Publishing Venue

IBM

Related People

Liu, L: AUTHOR

Abstract

Disclosed is a technique for optimizing performance of memory systems such that the transfer of a block of data may be pre-empted by another more urgent memory transfer. The benefit is the reduction of trailing-edge penalties due to bus contention.

Pre-emptive Cache Miss Processing

      Disclosed is a technique for optimizing performance of
memory systems such that the transfer of a block of data may be
pre-empted by another more urgent memory transfer.  The benefit is
the reduction of trailing-edge penalties due to bus contention.

      In computer systems a cache consists of a number of lines of
fixed size.  Upon a cache miss (i.e., when an accessed line is not
resident in cache), a copy of the missed line needs to be brought
into the cache from another level of storage hierarchy (e.g., second
level cache or main storage).  Such a line transfer normally takes
multiple cycles to complete due to limited bus bandwidth.  For
instance, a 256-byte line requires at least 16 cycles to transfer on
a fetch bus with a bandwidth of one quadword (16 bytes) per cycle.
While the transfer is ongoing, it blocks any subsequent data/signal
transfer along the same bus.  Through trace analysis it has been
observed that
cache misses tend to be clustered in time.  Furthermore, in more
aggressive system designs (e.g., with fast decode or with
prefetching) line fetch requests may be issued faster.  Hence, the
trailing-edge penalties due to bus contention will become more
visible in modern processors.
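
      The short C sketch below (not part of the original disclosure)
merely works out the arithmetic above: the 16-cycle transfer time of
a 256-byte line on a quadword-per-cycle fetch bus, and the wait seen
by a second miss that arrives while a strictly sequential transfer
holds the bus.  The 4-cycle arrival point is an arbitrary assumption
chosen for illustration.

#include <stdio.h>

/* Illustrative arithmetic only; the 4-cycle arrival point of the
 * second miss is an assumed value, not a figure from the disclosure. */
int main(void)
{
    const int line_bytes  = 256;  /* cache line size from the example */
    const int bus_bytes   = 16;   /* one quadword transferred per cycle */
    const int xfer_cycles = line_bytes / bus_bytes;   /* = 16 cycles  */

    /* A second (demand) miss arrives 4 cycles into the transfer.  On a
     * strictly sequential bus it must wait for the remaining cycles;
     * this residual wait is the trailing-edge penalty. */
    const int arrival_cycle = 4;
    const int wait_cycles   = xfer_cycles - arrival_cycle;

    printf("line transfer takes %d cycles\n", xfer_cycles);
    printf("second miss waits %d more cycles for the bus\n", wait_cycles);
    return 0;
}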

      One major reason for the trailing-edge effect is that
conventional designs perform line transfers in a sequential manner
for simplicity.  Such penalties may be reduced if we can provide
better priority schemes so that demand requests can be satisfied in
a more timely fashion.  We will illustrate the new techniques with
examples.  Consider a processor P with a private first-level cache
(L1) and a second-level cache (L2).  A line that misses in L1 is
transferred from L2 when it hits there.  An L2 miss triggers further
operations to bring the line into L2 first.  Furthermore, in order
to simplify our discussion, we assume that the L1 miss requests (to
L2) are always caused by demand accesses from the processor (i.e.,
we do not consider L1 misses due to anticipatory accesses such as
prefetching).  It is generally considered beneficial to satisfy the
demand requests as soon as possible.  We also assume the well-known
"bypass" technique in the design.  That is, upon a cache miss, the
data unit (e.g., quadword) whose access triggered the miss is
transferred back (from L2) first, and the rest of the units in the
line are transferred back subsequently in a certain (e.g., rotating)
order.  As soon as the...
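
      To make the "bypass" return order and the pre-emption idea
concrete, the C sketch below gives one possible reading of the scheme
described above.  It is illustrative only: the function names, the
urgent_pending flag, and the choice to allow pre-emption only after
the demanded quadword has been bypassed to the processor are
assumptions for illustration, not the disclosure's actual design,
which is abbreviated here.

#include <stdio.h>

#define UNITS_PER_LINE 16   /* 256-byte line viewed as 16 quadwords */

/* Quadword sent in step i of the "bypass" (rotating) return order for
 * a line whose miss was triggered by an access to quadword `critical`. */
static int bypass_order(int critical, int i)
{
    return (critical + i) % UNITS_PER_LINE;
}

/* Transfer one line, one quadword per bus cycle.  The demanded quadword
 * is bypassed to the processor first; afterwards a pending, more urgent
 * demand miss (the urgent_pending flag) may pre-empt the bus, deferring
 * the trailing units of this line. */
static void transfer_line(int critical, const int *urgent_pending)
{
    for (int i = 0; i < UNITS_PER_LINE; i++) {
        if (i > 0 && *urgent_pending) {
            printf("  cycle %2d: pre-empted by a new demand miss; "
                   "%d units of this line deferred\n",
                   i, UNITS_PER_LINE - i);
            return;
        }
        printf("  cycle %2d: send quadword %2d%s\n",
               i, bypass_order(critical, i),
               (i == 0) ? "  (bypassed to processor)" : "");
    }
}

int main(void)
{
    int urgent = 0;
    printf("uninterrupted transfer:\n");
    transfer_line(5, &urgent);

    printf("transfer pre-empted by an urgent miss:\n");
    urgent = 1;
    transfer_line(5, &urgent);
    return 0;
}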