Browse Prior Art Database

# A Conditional LRU-Algorithm for Shared Caches with Incomplete History

IP.com Disclosure Number: IPCOM000123241D
Original Publication Date: 1998-Jul-01
Included in the Prior Art Database: 2005-Apr-04
Document File: 4 page(s) / 139K

IBM

## Related People

Hanno Ulrich, D: AUTHOR

## Abstract

Abstract The paper presents a replacement algorithm for large shared caches which have incomplete knowledge about the activity on their cache lines. It is an extension to traditional LRU (Least Recently Used) algorithms which avoids to overwrite cache lines that are still in use.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 40% of the total text.

A Conditional LRU-Algorithm for Shared Caches with Incomplete History

Abstract
The paper presents a replacement algorithm for large shared caches
which have incomplete knowledge about the activity on their cache
lines.  It is an extension to traditional LRU (Least Recently Used)
algorithms which avoids to overwrite cache lines that are still in
use.

Problem Description
A cache is a matrix of columns and rows.  Each cell of the matrix
contains a cache line.  The address of a cache line determines the
row where it resides.  Within the row however, the line can reside in
any column.  The number of columns is therefore called the cache's
associativity.

LRU Algorithms
The cache replacement algorithm is used to identify the column
where a new cache line is to be installed.  Commonly used and
discussed here are LRU-algorithms which aim to locate the Least
Recently Used cache cell for replacement.
LRU-algorithms are usually implemented via pointers to the Most
Recently Used (MRU) cell.  In the following, we use young and old as
synonyms for MRU and LRU respectively.

For a true LRU-decision, any two cells in a row are
compared with one another and one bit per pair is used to point to
the younger cell.  For an n-way associative cache, '(' sub %
adjust(d 2) n above adjust(u 2) 2 sub % ')' = n * (n-1) %/% 2 bit
pointers are needed.  With each reference to a cell, all n-1
pointers referring to this cell must be updated.  Candidate for
replacement is the cell which is pointed away by all MRU-pointers,
hence the oldest cell.

Since the number of bit pointers needed for a true LRU grows
with n**2, caches of higher associativity often use a partitioned
LRU-algorithm:
First, all n cells of a cache row are partitioned in two halfs and
a 1-bit pointer points to the half which contains the youngest entry.
Then, each half is again subdivided in two halves and another two
MRU-bits are added.  For n = 2**m, we thus get n-1 MRU-bits and m =
log sub 2 (n) decision levels.  Example for n = 8:

The MRU-bits are updated upon each reference to a
cell.  This way, the algorithm identifies the youngest cell.
Candidate for replacement is the cell which is most distant from the
youngest cell.  This is not necessarily the oldest cell but the cell
which at each level belongs to the older half.

The efficiency of an LRU-algorithm relies heavily on an
exact knowledge of the activity on the cache lines.  If this is not
the case, we talk of

Caches with Incomplete History
A typical example for a cache with incomplete history is a shared
level 2 cache (L2) in a multi-processor system where each processor
has a private level 1 cache (L1).

We assume a cache coherency protocol which makes the L2 a
superset of the non-disjoint union of L1-caches.  As a consequence,
each processor must inform the L2 when it invalidates a line in its
private L1.  And the L2 in turn must inform the affected L1-caches
when it install...