
Memory Operations in Data Parallelism

IP.com Disclosure Number: IPCOM000105765D
Original Publication Date: 1993-Sep-01
Included in the Prior Art Database: 2005-Mar-20
Document File: 4 page(s) / 107K

Publishing Venue

IBM

Related People

Ekanadham, K: AUTHOR [+2]

Abstract

Within DATA PARALLELISM, memory operations can be organized along more conventional lines that take advantage of the data partition manifest in the definition of DATA PARALLELISM. The resulting system combines SEND/RECEIVE with a means of directly accessing information in the memory of another processing node. The CSRs that use SEND/RECEIVE to coordinate the set/use of data that is modified and used by different nodes of a network of processors can use these selfsame CSRs to coordinate the set/use of data in a shared-memory system. The idea behind running a DATA PARALLELISM problem on a shared-memory system is to give the data a common memory map that permits data accesses to be resolved within a remote memory.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 45% of the total text.

Memory Operations in Data Parallelism

      Within DATA PARALLELISM, memory operations can be organized
along more conventional lines that take advantage of the data
partition manifest in the definition of DATA PARALLELISM.  The
resulting system combines SEND/RECEIVE with a means of directly
accessing information in the memory of another processing node.
The CSRs that use SEND/RECEIVE to coordinate the set/use of data
that is modified and used by different nodes of a network of
processors can use these selfsame CSRs to coordinate the set/use of
data in a shared-memory system.  The idea behind running a DATA
PARALLELISM problem on a shared-memory system is to give the data a
common memory map that permits data accesses to be resolved within a
remote memory.  The memory mapping can distinguish references to R/O
data, which can be accessed directly, from references to modified
data, which must be coordinated by a SEND/RECEIVE mechanism.  The
SEND/RECEIVE mechanism is isomorphic to the software locks employed
for this purpose, but with SEND/RECEIVE the coordination is
accomplished directly in the hardware, either within the memory
mapping function or within the local memory of the processor that
owns the data.
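
      To make this distinction concrete, the following C fragment is
a minimal sketch, not taken from the disclosure, of a memory-mapping
layer that routes references through either path.  The region
descriptor and the primitives remote_read, send_buffer, and
wait_receive_buffer are illustrative assumptions only.

    /* Minimal sketch; all primitives below are assumed to be
     * supplied by the hardware/runtime and are not part of the
     * original disclosure. */

    #include <stddef.h>

    typedef struct {              /* one entry of the memory map      */
        int   owner_node;         /* node whose local memory holds it */
        int   read_only;          /* nonzero for R/O data             */
    } region_t;

    extern void remote_read(int node, void *dst, size_t len);
    extern void send_buffer(int node, const void *req, size_t len);
    extern void wait_receive_buffer(void *dst, size_t len);

    /* Resolve one reference through the memory map. */
    void access_region(const region_t *r, void *buf, size_t len)
    {
        if (r->read_only) {
            /* R/O data: fetch directly from the owning node's
             * memory; no coordination is needed because the data
             * is never modified. */
            remote_read(r->owner_node, buf, len);
        } else {
            /* Modified data: the set/use is coordinated in
             * hardware by a SEND/RECEIVE pair, playing the role
             * a software lock would otherwise play. */
            send_buffer(r->owner_node, &len, sizeof len); /* request */
            wait_receive_buffer(buf, len);                /* reply   */
        }
    }

In this picture the lock/unlock pair of a software scheme collapses
into the single blocking receive, which is why the text describes the
two mechanisms as isomorphic.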

      There are two distinct types of parallelism, which can be
categorized as Coarse-Grained (CG) parallelism and Fine-Grained (FG)
parallelism.  Fine-grained parallelism operates at the instruction
level and partitions a putative instruction stream that has a single
logical register file and a single memory hierarchy among several
processor elements.  As such, fine-grained parallelism allows
successive instructions to be executed in parallel and requires that
the results of such executions conform to a RUBRIC OF SEQUENTIAL
CORRECTNESS.  Another implication of this is that the memory
hierarchy that supports fine-grained parallelism is common to all
processor elements that share the same putative instruction stream.

      The basic computational entity within coarse-grained
parallelism is a THREAD, which is given a name.  Each THREAD
comprises a sequence of steps (beads), each of which is one of the
following types:

1.  Compute Step (Using Local Memory/Registers)

2.  Conditional Fork and Thread(Name) Creation

3.  Send Buffer to Name

4.  Wait & Receive Buffer

These threads are called CSRs because of the compute-send-receive
aspect of their structure.  The COMPUTE-STEP is a long sequence of
instructions that operates within the context of a local memory
consisting of private registers and a private memory hierarchy.  The
SEND-BUFFER and WAIT&RECEIVE-BUFFER operations are performed in
conjunction with the local memory associated with the named-THREAD,
and different named-THREADS can have different templates for
realizing the structure of the local memory within the common
hardware.  An important parameter of such coarse-grained parallelism
is the ratio of th...
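
      Although the disclosure gives no code, the four step types can
be pictured with the following C skeleton of a CSR thread.  The
thread names, the buffer layout, and the primitives create_thread,
send_buffer, and wait_receive_buffer are illustrative assumptions
only.

    /* Hypothetical CSR-thread skeleton built from the four step
     * types listed above; primitives are assumed, not taken from
     * the original disclosure. */

    #include <stddef.h>

    typedef struct { double data[256]; } buffer_t;

    extern void create_thread(const char *name, void (*body)(void));
    extern void send_buffer(const char *name, const buffer_t *b);
    extern void wait_receive_buffer(buffer_t *b);

    static buffer_t local;        /* this thread's private memory */

    static void worker(void)      /* body of a spawned thread     */
    {
        buffer_t b;
        wait_receive_buffer(&b);  /* receive work from the parent */
        /* ... compute on b, then send results onward ... */
    }

    void csr_thread(void)
    {
        size_t i;

        /* 1. Compute step: a long run of instructions confined to
         *    private registers and a private memory hierarchy. */
        for (i = 0; i < 256; i++)
            local.data[i] *= 2.0;

        /* 2. Conditional fork: create a named thread when more
         *    parallelism is warranted. */
        if (local.data[0] > 0.0)
            create_thread("worker", worker);

        /* 3. Send a buffer to the named thread. */
        send_buffer("worker", &local);

        /* 4. Wait & receive a buffer; the thread blocks here until
         *    a matching SEND arrives. */
        wait_receive_buffer(&local);
    }

The numbered comments correspond to the four step types enumerated
above.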