Browse Prior Art Database

Data Merging for Shared Memory Multiprocessors

IP.com Disclosure Number: IPCOM000105826D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 2 page(s) / 87K

Publishing Venue

IBM

Related People

Karp, AH: AUTHOR [+2]

Abstract

A process is described for keeping shared data consistent in a shared-memory multiprocessor system. This process is more efficient than previous solutions because it only performs inter-processor communication when there is true sharing of data among processors; in all previous solutions, either additional work is done or inter-processor communication is generated when there is false sharing of data among processors, i.e., accesses to distinct data words contained within the same cache line or virtual memory page.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Data Merging for Shared Memory Multiprocessors

      A process is described for keeping shared data consistent in a
shared-memory multiprocessor system.  This process is more efficient
than previous solutions because it only performs inter-processor
communication when there is true sharing of data among processors; in
all previous solutions, either additional work is done or
inter-processor communication is generated when there is false
sharing of data among processors, i.e., accesses to distinct data
words contained within the same cache line or virtual memory page.

      The process is described for a shared-memory multiprocessor
system in which each processor has a private cache memory that can
hold some number of data blocks (cache blocks) from the global shared
memory.  A data block contains some number of data elements.  The
terms cache memory and data block are used in a generic sense; this
process also applies to virtual pages being moved between (private)
local memories and (shared) global memories, as well as to multiple
levels of hierarchical shared memory.

      A description of the process is outlined below.  This process
uses the fact that it is a programmer error for two or more
processors to modify the same element in a data block without using
proper synchronization.

o   Associated with each data block in global memory is

    1.  A bit vector of size B bits, where B = number of data
        elements in a data block.  The bit vector identifies the data
        elements that have been modified in the data block, and is
        initialized to all zeroes.

    2.  A counter of size (log P) bits, where P = (maximum) number of
        processors in the shared-memory multiprocessor system.  The
        counter identifies the number of processors that currently
        have a copy of the data block in their cache memory, and is
        initialized to zero.

o   When a processor requests a data block from global memory, the
    corresponding counter is incremented by one.

o   Each processor asynchronously reads and writes data elements in
    its private copy of the data block, i.e., in its cache block.

o   There are three possible ways for the cache block to be removed
    from the processor's cache memory:

    1.  When the cache block is evicted to make room for a newly
        requested cache block, and the old cache block has no dirty
        bits [*]  (i.e., no data elements had been written into the
        old cache block).

        In this case, the counter for the corresponding data block in
        global memory is decremented by one.  No other communication
        with global memory is necessary.

    2.  When the cache...