Use of a Hardware Monitor to Create Send/Receive within CSRs

IP.com Disclosure Number: IPCOM000105704D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 4 page(s) / 142K

Publishing Venue

IBM

Related People

Ekanadham, K: AUTHOR [+2]

Abstract

Given a sequential program, the task of creating a set of CSRs that execute the program correctly requires that memory accesses to shared data be coordinated using SEND and WAIT&RECEIVE operations. A hardware monitor can assist in this task by identifying the required coordination.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 39% of the total text.

      Given a sequential program, the task of creating a set of CSRs
that execute the program correctly requires that memory accesses to
shared data be coordinated using SEND and WAIT&RECEIVE operations.  A
hardware monitor can assist in this task by identifying the required
coordination.

      To operate properly in a distributed memory system (DMS), the
CSRs must ensure that the correct pattern of set/use is followed for
data that is shared between CSRs.  The same set of CSRs, operating
under sequential serialization and with a shared memory, will create
memory events that correspond to the SEND/RECEIVE steps that would be
required in a DMS.  A hardware monitor attached to the memory system
can correlate these memory events with software instructions within
the proposed CSRs and thereby pinpoint the SEND/RECEIVE requirements
of a proposed set of CSRs.
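
      The correlation described above can be illustrated with a small
software sketch.  This is not the disclosed hardware monitor; it only
assumes that the monitor yields a sequential trace of memory events,
each tagged with the issuing CSR.  The record layout (MemEvent), the
operation labels "set" and "use", and the function name
find_send_receive_pairs are hypothetical names chosen for this
example.  Whenever a CSR uses an address last set by a different CSR,
the sketch reports that the setter would need a SEND and the user a
matching RECEIVE in a DMS.

```python
from collections import namedtuple

# Hypothetical trace record: issuing CSR, access kind ("set" or "use"), address.
MemEvent = namedtuple("MemEvent", ["csr", "op", "addr"])

def find_send_receive_pairs(trace):
    """Return (producer, consumer, addr) triples, in trace order, for each
    value that one CSR sets and a different CSR later uses."""
    last_writer = {}   # addr -> CSR that most recently set it
    delivered = set()  # triples already reported for the current value
    pairs = []
    for ev in trace:
        if ev.op == "set":
            last_writer[ev.addr] = ev.csr
            # A new value invalidates earlier deliveries of this address.
            delivered = {d for d in delivered if d[2] != ev.addr}
        elif ev.op == "use":
            producer = last_writer.get(ev.addr)
            if producer is not None and producer != ev.csr:
                key = (producer, ev.csr, ev.addr)
                if key not in delivered:
                    delivered.add(key)
                    pairs.append(key)
    return pairs

# Illustrative trace (made-up addresses and CSR names).
trace = [
    MemEvent("A", "set", 0x100),
    MemEvent("A", "use", 0x100),  # same CSR: no message needed
    MemEvent("B", "use", 0x100),  # B uses A's data: A sends, B receives
    MemEvent("B", "use", 0x100),  # value already delivered to B
    MemEvent("B", "set", 0x200),
    MemEvent("A", "use", 0x200),  # A uses B's data: B sends, A receives
    MemEvent("A", "set", 0x100),  # new value at 0x100
    MemEvent("B", "use", 0x100),  # B needs the fresh value
]
pairs = find_send_receive_pairs(trace)
```

Each reported triple names a producer CSR, a consumer CSR, and the
shared address, which is the information needed to decide where SEND
and WAIT&RECEIVE steps would have to be placed.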

      Parallelism can be categorized into two distinct types:
Coarse-Grained (CG) parallelism and Fine-Grained (FG) parallelism.
Fine-grained parallelism operates at the instruction level and
partitions a putative instruction stream, which has a single logical
register file and a single memory hierarchy, among several processor
elements.  As such, fine-grained parallelism allows successive
instructions to be executed in parallel and requires that the results
of such executions conform to a RUBRIC OF SEQUENTIAL CORRECTNESS.  A
further implication is that the memory hierarchy that supports
fine-grained parallelism is common to all processor elements that
share the same putative instruction stream.

      The basic computational entity within coarse-grained
parallelism is a THREAD, which is given a name.  Each THREAD
comprises a sequence of steps (beads), each of which is one of the
following types:

1.  Compute Step (Using Local Memory/Registers)

2.  Conditional Fork and Thread(Name) Creation

3.  Send Buffer to Name

4.  Wait & Receive Buffer

These threads are called CSRs because of the compute-send-receive
aspect of their structure.  A COMPUTE-STEP is a long sequence of
instructions that operate within the context of a local memory
comprising private registers and a private memory hierarchy.  The
SEND-BUFFER and WAIT&RECEIVE-BUFFER operations are performed in
conjunction with the local memory associated with the named-THREAD,
and different named-THREADS can have different templates for
realizing the structure of the local memory within the common
hardware.  An important parameter of such coarse-grained parallelism
is the ratio of the COMPUTE-STEP time to the SEND-BUFFER time.
Coarse-grained parallelism usually involves a distributed memory
system in which each CSR is supported by its own private memory.
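
      The step types listed above can be sketched in software as
follows.  This is a minimal illustration under stated assumptions,
not the mechanism of the disclosure: Python threads stand in for
named THREADS, a per-name queue stands in for the buffer transport,
and the names mailboxes, send_buffer, wait_and_receive, producer and
consumer are hypothetical.  Thread creation is done unconditionally
by the main program in place of the conditional fork step.

```python
import queue
import threading

# Hypothetical registry: one incoming-buffer mailbox per thread name.
mailboxes = {"producer": queue.Queue(), "consumer": queue.Queue()}

# Used only so the example can report its outcome; not part of the CSR model.
results = {}

def send_buffer(name, buf):
    """SEND step: deliver a copy of the buffer to the named thread."""
    mailboxes[name].put(list(buf))

def wait_and_receive(name):
    """WAIT&RECEIVE step: block until a buffer arrives for this thread."""
    return mailboxes[name].get()

def producer():
    local = [i * i for i in range(4)]   # compute step on private data
    send_buffer("consumer", local)      # send step

def consumer():
    buf = wait_and_receive("consumer")  # wait & receive step
    results["sum"] = sum(buf)           # compute step on the private copy

# Thread creation (stands in for the fork / Thread(Name) creation step).
threads = [
    threading.Thread(target=consumer, name="consumer"),
    threading.Thread(target=producer, name="producer"),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the buffer is copied on SEND, neither thread ever touches the
other's local data; all coordination happens through the mailbox, as
it would in a distributed memory system.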

      CSRs operate with a Distributed Memory System (DMS), and the
coordination of shared data is accomplished solely by means of
SEND/RECEIVE.  In...