Software-controlled Store-in Caches in Multiprocessor Systems

IP.com Disclosure Number: IPCOM000099379D
Original Publication Date: 1990-Jan-01
Included in the Prior Art Database: 2005-Mar-14
Document File: 2 page(s) / 62K

Publishing Venue

IBM

Related People

Chang, JH: AUTHOR [+2]

Abstract

A store-in cache not only keeps the recently fetched data of a CPU, but it also minimizes the store traffic between the CPU and the memory. Compared to a store-thru cache, it allows the main memory in a multiprocessor (MP) system to be shared by many more CPUs.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

       A store-in cache not only keeps the recently fetched data
of a CPU, but it also minimizes the store traffic between the CPU and
the memory.  Compared to a store-thru cache, it allows the main
memory in a multiprocessor (MP) system to be shared by many more
CPUs.
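
The store-traffic claim can be illustrated with a toy model. The sketch below is not part of the disclosure: it assumes a small, fully associative cache with LRU replacement, counts only stores (fetch traffic is ignored), and the names `store_thru_writes` and `store_in_writes` are hypothetical. Each store is represented by the address of the line it touches.

```python
def store_thru_writes(store_lines):
    """Store-thru: every store is forwarded to memory."""
    return len(store_lines)


def store_in_writes(store_lines, capacity=1):
    """Store-in: a changed line reaches memory only when it is cast out.

    Assumes a fully associative cache of `capacity` lines with LRU
    replacement; changed lines still resident at the end are flushed once.
    """
    resident = []                      # line addresses, least recently used first
    writes = 0
    for line in store_lines:
        if line in resident:
            resident.remove(line)      # hit: refresh the LRU position
        elif len(resident) == capacity:
            resident.pop(0)            # miss: cast out the LRU line
            writes += 1                # every resident line is changed here
        resident.append(line)
    return writes + len(resident)      # final flush of the changed lines
```

For the store sequence `[0, 0, 0, 1, 1, 0]`, the store-thru model sends six writes to memory, while the one-line store-in model sends three.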

      Suppose that the cache-based MP system has no hardware to
handle cache coherence but provides specific instructions for
software to control the cache contents (RP3 is an example of such an
MP system).  Since conventional store-in caches only allow data
sharing on a line basis, a shared (read and write) line can only be
loaded into the cache of one CPU at a time.  Even if another CPU
wants to share a different part of the line, the whole line has to be
moved from one cache to another.  Otherwise, the memory can be left
in an inconsistent state when different versions of the line are
later written back to it.  Such shared lines, which are very common
in many workloads, either cause many cache cross interrogations
(XIs) or cannot be loaded into any cache at all.  One obvious
solution is to maintain a change bit for every access unit (for
example, one word) in a line, but this requires a very large number
of change bits for the cache.  For example, with one change bit per
4-byte word, a cache of 64 Kbytes needs 2 Kbytes of change bits.
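
The write-back hazard and the per-word remedy described above can be sketched as follows. This is an illustration only, not the disclosed scheme. The 4-byte word is an assumption chosen because it reproduces the 2-Kbyte figure; the 4-word line and the names `change_bit_bytes`, `write_back_line`, `write_back_changed`, and `demo` are hypothetical.

```python
WORD_BYTES = 4  # assumed word size; it reproduces the 2-Kbyte figure


def change_bit_bytes(cache_bytes, unit_bytes=WORD_BYTES):
    """Bytes of storage needed for one change bit per access unit."""
    return cache_bytes // unit_bytes // 8


def write_back_line(memory, copy):
    """Line-granular write-back: the whole cached copy overwrites memory."""
    memory[:] = copy


def write_back_changed(memory, copy, change_bits):
    """Word-granular write-back: only words whose change bit is set."""
    for i, changed in enumerate(change_bits):
        if changed:
            memory[i] = copy[i]


def demo():
    line = [0, 0, 0, 0]                 # one 4-word line in memory
    cpu_a, cpu_b = line.copy(), line.copy()
    cpu_a[0] = 11                       # CPU A stores to word 0
    cpu_b[3] = 22                       # CPU B stores to word 3

    lossy = line.copy()
    write_back_line(lossy, cpu_a)
    write_back_line(lossy, cpu_b)       # overwrites CPU A's word 0

    merged = line.copy()
    write_back_changed(merged, cpu_a, [1, 0, 0, 0])
    write_back_changed(merged, cpu_b, [0, 0, 0, 1])
    return lossy, merged
```

In this sketch, `demo()` returns a line that has lost CPU A's store when whole lines are written back, and a line holding both stores when only the changed words are written. `change_bit_bytes(64 * 1024)` evaluates to 2048, matching the 2 Kbytes quoted above.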

      A scheme is disclosed to substantially reduce the number of
change bits and allow shared lines to be loaded to more th...