Accessing Incomplete Cache Lines

IP.com Disclosure Number: IPCOM000120624D
Original Publication Date: 1991-May-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 3 page(s) / 132K

Publishing Venue

IBM

Related People

Liu, L: AUTHOR

Abstract

Disclosed is a technique for accessing a cache line that is partially filled during cache miss processing.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 49% of the total text.

      Caches have been widely used in processor designs as fast
buffers for recently used data lines.  A cache miss occurs when an
accessed line is not resident in the cache. Upon a cache miss, the
cache control normally requests a copy of the line from another level
of the memory hierarchy (e.g., second level cache or main memory).  In
conventional designs the missed line becomes accessible at the cache
array only when it has been completely fetched.  One problem associated
with cache miss processing is the so-called trailing-edge penalty.
Consider a design in which each cache line is 128 bytes long, and the
cache line fetch bandwidth is 16 bytes per cycle.  It takes (at
least) 8 cycles to fill up the line when the missed data is
transferred back.  Although the bypass method has been used so that
the unit (e.g., doubleword) actually requested by the I/E-unit is
transferred back first and execution can resume, subsequent I/E-unit
accesses are likely to target the same line before the miss fetch
has completed.  Performance is lost if such subsequent accesses to
the line cannot be granted until the miss processing is complete.
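
      The arithmetic above can be checked with a short Python sketch.
It is illustrative only and not part of the disclosure: the function
names are hypothetical, and the wrap-around delivery order after the
requested chunk is an assumption, since the text states only that the
requested unit is transferred first.

```python
def fill_cycles(line_bytes: int, bytes_per_cycle: int) -> int:
    """Minimum number of transfer cycles needed to deliver one full line."""
    return -(-line_bytes // bytes_per_cycle)  # ceiling division


def arrival_cycles(line_bytes: int, bytes_per_cycle: int, requested_offset: int) -> dict:
    """Map each transfer chunk index to the 1-based cycle in which it arrives.

    Assumption: the chunk holding the requested byte is sent first and
    the remaining chunks follow in wrap-around order.
    """
    n = fill_cycles(line_bytes, bytes_per_cycle)
    first = requested_offset // bytes_per_cycle
    return {(first + i) % n: i + 1 for i in range(n)}


# The example from the text: 128-byte line, 16 bytes per cycle.
cycles = fill_cycles(128, 16)
arrivals = arrival_cycles(128, 16, requested_offset=40)

# A follow-up access to byte 56 falls in chunk 3. If it must wait for
# the whole line, it stalls for the difference below.
stall = cycles - arrivals[56 // 16]
```

Under these assumptions the chunk holding byte 56 arrives in the second
transfer cycle, yet a design that waits for the complete line cannot
grant the access until the last cycle. That gap is the trailing-edge
penalty.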

      This invention proposes mechanisms for accessing incomplete
lines.  That is, while a cache line is being filled (upon a line
fetch), the portions already filled are made accessible as soon as
they are available.  For simplicity of illustration we describe only
the methods for fetch accesses from the CPU.  Data stores can be
handled similarly.  Although the concept is described for a simple
cache directly accessed by a processor, it can be applied to other
cache organizations whenever appropriate.

      In the following we consider a design in which at most one
incomplete cache line is being filled at any moment.  Generalization
to designs with multiple incomplete lines should be straightforward.
The figure depicts a processor organization.  The I/E-units may
request data access to the cache unit.  Fetched data is returned
(to the requesting I/E-unit) at doubleword granularity (and may be
shifted/rotated internally).  Within the cache unit there is a cache
directory (DIR) and cache arrays (ARR) as usual. BUF is a typical
speed-match buffer used for missed line fetch (e.g., from second
level cache or from main memory).  Upon a cache miss, data fill
(putaway) to ARR is done on a doubleword basis, whenever the
doubleword data is available and there is a free array cycle.  When a
doubleword is fetched (e.g., from main memory) and cannot be put away
(e.g., because ARR is busy), the doubleword is buffered in BUF
first.  A bit-vector V indicates which doublewords of the line being
filled are ready in ARR.  V may be implemented in the
control of either DIR or ARR, depending upon the particular design
choice.  Certain simple mechanism (e.g., using a special r...
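
      The roles of ARR, BUF and V described above can be modeled with a
minimal Python sketch.  It is a toy model, not the disclosed
implementation: the class and method names are hypothetical, the line
is assumed to hold 16 doublewords (a 128-byte line with 8-byte
doublewords), and a fetch of a doubleword not yet marked in V simply
returns None, because the excerpt does not say how such an access is
handled.

```python
from collections import deque

DOUBLEWORDS_PER_LINE = 16  # assumption: 128-byte line, 8-byte doublewords


class IncompleteLine:
    """Toy model of the single line being filled (hypothetical names)."""

    def __init__(self, n: int = DOUBLEWORDS_PER_LINE):
        self.v = [False] * n    # V: True once the doubleword is ready in ARR
        self.arr = [None] * n   # ARR slots belonging to this line
        self.buf = deque()      # BUF: (index, data) pairs awaiting putaway

    def arrive(self, index: int, data, arr_free: bool) -> None:
        """A doubleword returns from the next level of the hierarchy."""
        if arr_free:
            self._putaway(index, data)
        else:
            self.buf.append((index, data))  # ARR busy: hold in BUF first

    def free_array_cycle(self) -> None:
        """Use a free ARR cycle to put away the oldest buffered doubleword."""
        if self.buf:
            self._putaway(*self.buf.popleft())

    def _putaway(self, index: int, data) -> None:
        self.arr[index] = data
        self.v[index] = True

    def fetch(self, index: int):
        """Return the doubleword if V marks it ready in ARR, else None."""
        return self.arr[index] if self.v[index] else None

    def complete(self) -> bool:
        return all(self.v)


line = IncompleteLine()
line.arrive(5, "dw5", arr_free=True)    # put away at once, V[5] is set
line.arrive(6, "dw6", arr_free=False)   # ARR busy, held in BUF
early = line.fetch(5)                   # granted before the line is complete
pending = line.fetch(6)                 # not yet ready in ARR
line.free_array_cycle()                 # buffered doubleword is put away
late = line.fetch(6)
```

In this model an access to doubleword 5 is granted while most of the
line is still outstanding, which is the behavior the disclosure aims
for; a conventional design would instead wait for complete() to hold.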