Processor/Memory Switch Which Maintains the Temporal Ordering of Requests

IP.com Disclosure Number: IPCOM000120598D
Original Publication Date: 1991-May-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 4 page(s) / 168K

Publishing Venue

IBM

Related People

White, SW: AUTHOR

Abstract

To alleviate performance problems associated with slow memories (relative to processor cycle times), most large computer systems rely on "banking" to create the appearance of a faster memory. Typical banking schemes partition physical memory into N identical blocks, with consecutive addresses in adjacent blocks; N consecutive requests can be made before returning to the original bank. In this case, the apparent memory cycle time becomes the actual memory cycle time divided by the number of banks, N. The resulting gain in performance is at the expense of complexity. A processor sending a request to one of N banks needs a switch (or routing network) to steer its request to a given bank.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 46% of the total text.

      To alleviate performance problems associated with slow
memories (relative to processor cycle times), most large computer
systems rely on "banking" to create the appearance of a faster
memory.  Typical banking schemes partition physical memory into N
identical blocks, with consecutive addresses in adjacent blocks; N
consecutive requests can be made before returning to the original
bank.  In this case, the apparent memory cycle time becomes the
actual memory cycle time divided by the number of banks, N.  The
resulting gain in performance is at the expense of complexity.  A
processor sending a request to one of N banks needs a switch (or
routing network) to steer its request to a given bank. In a
multiple-processor system, as the number of processors increases, the
rate of requests increases, and the number of banks must
correspondingly increase to maintain the same apparent memory cycle
time.  In high-performance systems, the switch's cost can be
significant since the data paths are extremely wide and small
switch/memory latencies are one key to a high-performance memory
system.  To decrease cost with minimal performance degradation, the
memory partitioning or "banking" is often hierarchical; banks are
grouped together into BSMs (Base Storage Modules).  On a given cycle,
any set of "ready" banks can be accessed provided that the set does
not include two banks from a common module.
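
      The banking scheme above can be illustrated with a minimal
sketch.  It assumes low-order interleaving (bank = address mod N) and
assumes that banks are grouped contiguously into BSMs; the disclosure
states neither mapping explicitly.  The names bank_of, bsm_of,
apparent_cycle_time and conflict_free are hypothetical and chosen
only for illustration.

```python
from typing import Sequence


def bank_of(address: int, num_banks: int) -> int:
    """Assumed low-order interleaving: consecutive addresses map to adjacent banks."""
    return address % num_banks


def bsm_of(bank: int, banks_per_bsm: int) -> int:
    """Assumed grouping: banks 0..k-1 form BSM 0, banks k..2k-1 form BSM 1, and so on."""
    return bank // banks_per_bsm


def apparent_cycle_time(actual_cycle_time: float, num_banks: int) -> float:
    """Apparent cycle time for a stream of consecutive addresses: actual time / N."""
    return actual_cycle_time / num_banks


def conflict_free(banks: Sequence[int], banks_per_bsm: int) -> bool:
    """True if no two banks in the set belong to the same BSM."""
    bsms = [bsm_of(b, banks_per_bsm) for b in banks]
    return len(bsms) == len(set(bsms))


# Six consecutive addresses over four banks wrap around after N = 4 requests.
banks_hit = [bank_of(a, 4) for a in range(6)]
```

      With four banks and two banks per BSM, conflict_free([0, 2], 2)
holds because the two banks sit in different modules, while
conflict_free([0, 1], 2) does not because both banks share BSM 0.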

      Fig. 1 illustrates a typical Prior Art memory subsystem.  For
each Base Storage Module (BSM) or group of banks, an arbitrator
selects (at most) one of the processors' requests destined for the
associated BSM.  A conflict-free (i.e., cross-bar) switch may be used
since the arbitrators establish a one-to-one relationship between
requestors and BSMs.  Since requests may be destined for banks which
are still busy from previous requests, a "BSM queue" is included
prior to each BSM.  Requests can experience arbitrary delays as a
result of the arbitration requirement and the random profile of BSM
queue wait times. These arbitrary delays are manifested in a temporal
reordering of fetched data as seen by a given requestor. (When a
request experiences little delay, it may return before a logically
previous request that experienced a substantial delay.)  This
temporal reordering requires two additional hardware complexities in
the return (memory to requestor) path.  First, contention can occur
between two or more BSMs attempting to return data to a common
requestor. This means that an arbitration unit must be placed in the
return path.  The arbitration unit may block some return packets for
an unknown number of cycles, thereby requiring queues for data coming
back from the memory.  Temporal reordering in such a design also
forces the requestor to provide tags (on fetch requests) and
circuitry which inspects the tags of returned data to reconstruct the
proper sequence of th...
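
      The temporal reordering and tag-based reconstruction described
for the prior-art design can be sketched as follows.  This is a
simplified model, not the disclosed hardware: it assumes one request
is issued per cycle, that each request carries a sequential integer
tag equal to its issue index, and that each request's delay is given
as an input.  The names return_order and ReorderBuffer are
hypothetical.

```python
def return_order(delays):
    """Return request tags in arrival order.

    Request i is issued on cycle i and arrives on cycle i + delays[i].
    Ties are broken in favor of the lower tag (an assumption of this model).
    """
    arrivals = sorted((issue + delay, issue) for issue, delay in enumerate(delays))
    return [tag for _, tag in arrivals]


class ReorderBuffer:
    """Holds out-of-order returns and releases them in tag (issue) order."""

    def __init__(self):
        self.next_tag = 0   # tag of the next item the requestor may consume
        self.pending = {}   # tag -> data for returns that arrived early

    def accept(self, tag, data):
        """Store one returned item; return the list of items now releasable in order."""
        self.pending[tag] = data
        released = []
        while self.next_tag in self.pending:
            released.append(self.pending.pop(self.next_tag))
            self.next_tag += 1
        return released


# Request 0 is delayed by 5 cycles, so requests 1 and 2 return before it.
order = return_order([5, 1, 1])

rob = ReorderBuffer()
outputs = [rob.accept(tag, "data%d" % tag) for tag in order]
```

      In this run the first two returns are held back, and all three
items are released together, in issue order, once the delayed request
0 arrives.  A switch that preserves temporal ordering would make the
tags and the buffer unnecessary.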