
Shared Memory Design With Input And Output Queues

IP.com Disclosure Number: IPCOM000121118D
Original Publication Date: 1991-Jul-01
Included in the Prior Art Database: 2005-Apr-03
Document File: 4 page(s) / 166K

Publishing Venue

IBM

Related People

Foster, DJ: AUTHOR [+2]

Abstract

A shared memory design is described that is suitable for bus-based or network-based multiprocessor systems. It details a high-bandwidth, low-latency global memory system. The use of output and input FIFOs lessens the bandwidth limitation caused by non-uniform bus utilization, which is normally a problem in such memory designs.

Introduction

      One potential bottleneck in any multiprocessor system is the
shared (or global) memory subsystem.  Whether the multiprocessor is
bus-based or network (switch)-based, the memory subsystem must be
capable of processing requests at the full bandwidth of the
processor-memory bus or switch.  As bus (or network) traffic in a
multiprocessor system tends to utilize the bus in a non-uniform
fashion, the memory bandwidth should tolerate periods when the bus is
highly loaded.  As well as providing high bandwidth, the memory
should also present low latency to processor requests.  This memory
system was designed for the Advanced Computing Environment (ACE)
multiprocessor.  This system contains 8 processors, connected over a
custom 80 MByte/second interprocessor communication (IPC) bus.
Because the processors execute byte or halfword (i.e., 16 bits)
operations to memory via read-modify-write cycles, the memory had to
provide a lock capability for any memory address.
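
      As an illustration only (not part of the disclosure), the
following C sketch shows what the lock capability must support: a
byte store is performed as a read of the containing word, a merge of
the new byte, and a write back, with the address held locked so that
no other processor's request to that word is serviced in between.
The lock_addr() and unlock_addr() primitives are hypothetical
stand-ins for the memory's lock mechanism.

    #include <stdint.h>

    /* Hypothetical primitives standing in for the per-address lock. */
    extern void lock_addr(volatile uint32_t *addr);
    extern void unlock_addr(volatile uint32_t *addr);

    /* Store one byte into a 32-bit word via a locked read-modify-write. */
    void store_byte(volatile uint32_t *word, unsigned byte_index, uint8_t value)
    {
        uint32_t shift = 8u * byte_index;

        lock_addr(word);                /* bank enters its locked state   */
        uint32_t old    = *word;       /* read the containing word       */
        uint32_t merged = (old & ~(0xFFu << shift))
                        | ((uint32_t)value << shift);
        *word = merged;                 /* write the modified word back   */
        unlock_addr(word);              /* release the bank               */
    }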
ACE Shared Memory System

      The memory design presented uses multiple interleaved banks of
fast dynamic random-access memory (DRAM) to provide the required high
bandwidth and low latency.  A first-in, first-out (FIFO) memory is
used on the output path from each pair of memory banks.  This allows
the memory to continue accepting requests even when it has data
queued for output on the bus.  The use of a second FIFO is proposed
for the input stage to the memory so that processor requests are not
rejected when a memory bank is either busy or in a locked state
servicing a read-modify-write request.  A consistency protocol is
suggested for this input FIFO to ensure that requests are not
serviced out of sequence.
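
      A rough behavioral model in C of one memory card under this
scheme is sketched below; the structure and field names are
illustrative rather than taken from the disclosure.  Two banks share
an output FIFO of tagged read data, and the proposed input FIFO lets
the card keep accepting bus requests while a bank is busy or locked,
rejecting a request only when that queue is full.  Draining the input
FIFO in arrival order is what the suggested consistency protocol must
preserve.

    #include <stdbool.h>
    #include <stdint.h>

    #define QDEPTH 16                     /* illustrative queue depth        */

    struct request { uint32_t addr; uint8_t proc_id; bool is_read; };
    struct reply   { uint32_t data; uint8_t proc_id; };  /* read data + tag */

    struct in_fifo  { struct request q[QDEPTH]; int head, tail, count; };
    struct out_fifo { struct reply   q[QDEPTH]; int head, tail, count; };

    struct bank { bool busy; bool locked; };

    struct memory_card {
        struct bank     bank[2];          /* two interleaved banks per card  */
        struct in_fifo  in;               /* proposed input queue            */
        struct out_fifo out;              /* output queue shared by banks    */
    };

    /* Accept a bus request whenever the input FIFO has room, even if the
     * target bank is busy or locked, or the output FIFO still holds read
     * data waiting for a bus grant. */
    bool accept_request(struct memory_card *c, struct request r)
    {
        if (c->in.count == QDEPTH)
            return false;                 /* reject only when the queue is full */
        c->in.q[c->in.tail] = r;
        c->in.tail = (c->in.tail + 1) % QDEPTH;
        c->in.count++;
        return true;
    }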

      With the current cycle time of DRAM being some four times the
IPC cycle time (and with page mode operation prohibited because of
the unpredictable nature of requests in a multiprocessor), the memory
in its minimum configuration is interleaved four ways.  This
interleaved memory design provides the necessary input bandwidth (on
average) from the IPC bus.  As requests from the memory for bus
access are not necessarily granted immediately, an output FIFO memory
is used on each memory card.  Read data from both banks on each
memory card is stored in the FIFO, together with the processor
identifier tag necessary to identify the read request.  This means
that memory access can continue uninterrupted, even if the memory has
data queued for output on the bus.  An output arbitrator will then
output all of the data in the FIFO whenever the bus i...
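
      The four-way interleaving figure quoted earlier follows from
simple arithmetic, sketched below in C.  The cycle times are assumed
for illustration (a 4-byte transfer every cycle on the 80
MByte/second IPC bus gives roughly a 50 ns bus cycle, and a full DRAM
cycle of about four bus cycles); the excerpt states only their
roughly 4:1 ratio.

    #include <stdio.h>

    int main(void)
    {
        int ipc_cycle_ns  = 50;   /* assumed: 80 MByte/s at 4 bytes/transfer */
        int dram_cycle_ns = 200;  /* assumed: full DRAM cycle, no page mode  */

        /* Minimum number of interleaved banks so that, on average, one
         * request can be retired per bus cycle (ceiling division). */
        int banks = (dram_cycle_ns + ipc_cycle_ns - 1) / ipc_cycle_ns;

        printf("minimum interleave factor: %d-way\n", banks);  /* prints 4 */
        return 0;
    }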