Increasing Hit Ratios in Second Level Caches and Reducing the Size of Second Level Storage

IP.com Disclosure Number: IPCOM000042745D
Original Publication Date: 1984-Jun-01
Included in the Prior Art Database: 2005-Feb-04
Document File: 4 page(s) / 36K

Publishing Venue

IBM

Related People

Liu, L: AUTHOR

Abstract

In a processor with cache storages, each storage reference made by the processor is directed to the cache. The great majority of these references are for data found in the First Level Cache (L1). Data return for these references (so-called "cache hits") is very fast (with a typical time of 1 cycle). Those storage references ("cache misses") which are not cache hits are directed to the next fastest level in the storage hierarchy. In conventional designs the next fastest level is the Main Storage (MS), which provides a data return time much longer than L1. There have been various ideas on creating more memory levels between MS and L1. Typical suggestions are Second Level Caches (L2's), which provide return times slower than L1 but much faster than MS.



Increasing Hit Ratios in Second Level Caches and Reducing the Size of Second Level Storage

In a processor with cache storages, each storage reference made by the processor is directed to the cache. The great majority of these references are for data found in the First Level Cache (L1). Data return for these references (so-called "cache hits") is very fast (with a typical time of 1 cycle). Those storage references ("cache misses") which are not cache hits are directed to the next fastest level in the storage hierarchy. In conventional designs the next fastest level is the Main Storage (MS), which provides a data return time much longer than L1. There have been various ideas on creating more memory levels between MS and L1. Typical suggestions are Second Level Caches (L2's), which provide return times slower than L1 but much faster than MS.

In an L2 cache organization, some (or all) of the L1 misses are directed to the L2 cache. When a valid copy of the referenced line is found in L2 (an "L2 hit"), the line is fetched from L2 directly. Otherwise, the reference (an "L2 miss") is directed to MS, and a copy of the line is also fetched into the L2 cache. Since L2 is a finite cache with limited size (say, 1 megabyte), fetching new lines from MS may force the replacement of existing L2 cache units. For efficiency of implementation, the units of L2 are usually larger than the cache line size (e.g., 4096 bytes vs. 128 bytes). However, because of these larger unit sizes, replacement in L2 can discard lines that still have locality, and hence causes higher "L2-Miss-Ratios".

Recent data indicate that large portions of L1 misses are concentrated on relatively few line units in real storage. If those lines that are missed frequently are detected and placed in a separate buffer in L2 with a higher degree of residency, then the L2-Miss-Ratio and XI-penalties may be reduced greatly. In the system disclosed, mechanisms are provided that dynamically detect the line miss frequencies (in L1) and place such lines in a "small" buffer with a higher degree of residency at the second level. The mechanism for UP (uniprocessor) systems is discussed first; extensions to MP (multiprocessor) systems are discussed later.

In the following, consider a storage hierarchy scheme (see the sole figure) in which the second level consists of an L2 enhanced with a buffer called the FMLB (Frequently-Missed-Line Buffer). The FMLB is much smaller than the L2: typical FMLB sizes are 64K, 128K or 256K bytes, while typical L2 sizes are 1/4 megabyte, 1/2 megabyte or 1 megabyte. The L2 organization can be one of the usual L2 schemes, or one that allows modified data to be stored into it. In the following, first consider the case in which data can be stored into L2. Assume that the lines in L2 are numbered from 1 to N. An additional frequency counter array, named F_CNT, with N entries is needed to keep track of the L1 miss frequencies of each line in...
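The available text is truncated before the policy linking F_CNT to the FMLB is fully described, so the following C sketch only illustrates the structures named above: a per-L2-line frequency counter array (F_CNT), a small Frequently-Missed-Line Buffer searched ahead of L2, and the ordinary L2 lookup path backed by MS. The direct-mapped organization, the promotion threshold, the sizes chosen, and the function and variable names (second_level_reference, PROMOTE_THRESH, and so on) are assumptions made for the example, not details taken from the disclosure.

    /* Sketch of the second-level lookup path with an FMLB and an F_CNT array.
       Sizes, mapping, and the promotion policy are illustrative assumptions. */
    #include <stdio.h>
    #include <stdbool.h>
    #include <stdint.h>

    #define LINE_SIZE      128    /* L1/L2 line size in bytes (from the text)      */
    #define L2_LINES       8192   /* N: 1-megabyte L2 of 128-byte lines (assumed)  */
    #define FMLB_LINES     512    /* 64K-byte FMLB of 128-byte lines (assumed)     */
    #define PROMOTE_THRESH 4      /* assumed F_CNT value that triggers promotion   */

    typedef struct {
        uint32_t tag;             /* which real-storage line occupies this slot */
        bool     valid;
    } Entry;

    static Entry    l2[L2_LINES];
    static uint16_t f_cnt[L2_LINES];   /* F_CNT: per-L2-line L1 miss frequency */
    static Entry    fmlb[FMLB_LINES];  /* Frequently-Missed-Line Buffer        */

    /* Direct-mapped index functions -- an assumption; the disclosure does not
       fix the mapping. */
    static uint32_t l2_index(uint32_t line)   { return line % L2_LINES; }
    static uint32_t fmlb_index(uint32_t line) { return line % FMLB_LINES; }

    /* Called on every L1 miss for the line containing 'addr'.  Returns a label
       describing where the line was found, for illustration only. */
    static const char *second_level_reference(uint32_t addr)
    {
        uint32_t line = addr / LINE_SIZE;

        /* 1. The FMLB is searched first: frequently missed lines live here
              with a higher degree of residency than ordinary L2 lines.      */
        Entry *f = &fmlb[fmlb_index(line)];
        if (f->valid && f->tag == line)
            return "FMLB hit";

        /* 2. Otherwise search L2 and record the miss frequency in F_CNT.    */
        Entry *e = &l2[l2_index(line)];
        if (e->valid && e->tag == line) {
            if (++f_cnt[l2_index(line)] >= PROMOTE_THRESH) {
                /* Promote the frequently missed line into the FMLB (replacing
                   whatever occupied that FMLB slot) and reset its counter.  */
                f->tag   = line;
                f->valid = true;
                f_cnt[l2_index(line)] = 0;
            }
            return "L2 hit";
        }

        /* 3. L2 miss: fetch from Main Storage, install a copy in L2, and
              start a fresh frequency count for the new occupant.            */
        e->tag   = line;
        e->valid = true;
        f_cnt[l2_index(line)] = 1;
        return "L2 miss (fetched from MS)";
    }

    int main(void)
    {
        /* Reference the same line repeatedly: after enough L1 misses its
           F_CNT crosses the threshold and later misses hit in the FMLB.     */
        for (int i = 0; i < 6; i++)
            printf("L1 miss %d: %s\n", i, second_level_reference(0x1000));
        return 0;
    }

Compiled and run, the sketch shows a repeatedly missed line first served from MS, then from L2 while its F_CNT grows, and finally from the FMLB once the assumed threshold is reached, which is the higher degree of residency the disclosure aims for.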