Multiple Bypass Mechanism
Original Publication Date: 1991-May-01
Included in the Prior Art Database: 2005-Apr-02
Disclosed is a technique for extending conventional bypassing mechanisms for cache miss processing so that multiple data units can be bypassed. The benefit is improved memory access by the CPU and reduced cache traffic.
Bypassing is a classical technique for processor cache miss
handling. When a cache miss occurs for a fetch request by an
I/E-unit of the processor, the requesting unit waits for the data unit
to return from a lower level of the memory hierarchy (e.g., second
level cache or main storage). The size of a cache line (e.g., 64-128
bytes) is typically a multiple of the granule (e.g., doubleword) for
I/E-unit access. When the missed line is transferred back to the cache
unit, the data unit that caused the miss is typically transferred
first, and the remaining data units follow in sequential order,
wrapping to the beginning of the line after the last unit. When the
first data unit arrives, it is also bypassed to the requesting
I/E-unit so that the unit can resume its operations without having to
reinitiate the fetch access to the cache.
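The wrapped sequential transfer order described above can be illustrated with a minimal sketch. The function name transfer_order and its parameters are hypothetical and chosen only for illustration; the example assumes a 64-byte line made of 8 doublewords.

```python
def transfer_order(miss_index, units_per_line):
    """Return the order in which the data units of a missed line come back.

    The unit that caused the miss is sent first; the rest follow
    sequentially, wrapping to unit 0 after the last unit of the line.
    (Hypothetical helper, not part of the original disclosure.)
    """
    return [(miss_index + i) % units_per_line for i in range(units_per_line)]


# Assumed example: 8 doublewords per line, miss on doubleword 5.
print(transfer_order(5, 8))  # [5, 6, 7, 0, 1, 2, 3, 4]
```

Under conventional bypassing, only the first unit in this order (doubleword 5 in the example) is forwarded directly to the requesting I/E-unit.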
Although the conventional bypassing technique has been found
useful, it does not provide a similar benefit for accesses to data
units of the currently missed line beyond the one causing the miss.
Due to limitations on bus bandwidth, it takes multiple (e.g., 4-16)
cycles to have the missed line transferred back. Similarly, it may
take multiple cache cycles to have the missed line put away into cache
arrays. The trailing-edge penalties caused by accesses to the missed
line after the first unit has been bypassed can be significant if such
accesses can be satisfied only when miss processing is complete.
Supporting cache accesses to a line whose miss is still in progress
adds design complexity to the cache unit.
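As a rough illustration of the trailing-edge penalty, the sketch below estimates how long a follow-up access to the missed line must wait. It is a simplified model, not the disclosed design: it assumes one data unit arrives per cycle in wrapped sequential order, ignores cache put-away time, and assumes that with multiple bypass every later unit is forwarded as soon as it arrives. The function name trailing_edge_wait and its parameters are hypothetical.

```python
def trailing_edge_wait(miss_index, next_index, units_per_line, multiple_bypass):
    """Cycles after the first unit arrives until next_index can be used.

    Assumptions (for illustration only): one unit arrives per cycle in
    wrapped sequential order, and cache put-away time is ignored.
    """
    # Position of the requested unit in the wrapped transfer order.
    arrival = (next_index - miss_index) % units_per_line
    if multiple_bypass:
        return arrival          # forwarded as soon as it arrives
    return units_per_line - 1   # must wait until the whole line is back


# Assumed example: 8 doublewords per line, miss on 5, next access to 6.
print(trailing_edge_wait(5, 6, 8, multiple_bypass=False))  # 7
print(trailing_edge_wait(5, 6, 8, multiple_bypass=True))   # 1
```

In this model the gap between the two results grows with the line size, which is why sequential accesses suffer most when only the first unit is bypassed.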
The trailing-edge problem is primarily due to program
spatial locality. That is, data units physically close (or adjacent)
to each other tend to be accessed close together in time. Sequentiality is
a particularly strong characteristic. A possible approach to the
problem is to extend the conventional bypass facilities for
subsequent data accesses to a missed line.
The figure depicts a simple processor organization. The
I/E-units (including the instruction and execution units) are
enhanced with a Bypass Buffer (BP). For simplicity of illustration
we assume that both the I/E-unit access to cache and memory fetch
(for cache miss) are at doubleword bandwidth (i.e., 8 bytes per
cycle). BP may then be structured as a FIFO stack of N (e.g., N=4)
entries. Each entry of BP carries a doubleword, plus other necessary
information. During cache miss processing, BP holds certain
doublewords bypassed from the memory fetch. We also assume proper
mechanisms (e.g., addresses) for the I/E-unit to identify the
doublewords buffered in BP. A simpler design of multiple bypass is as follows:
1. When a cache miss occurs, the cache unit issu...