Cache Optimized Structure Supporting Global Object Locks

IP.com Disclosure Number: IPCOM000112806D
Original Publication Date: 1994-Jun-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 8 page(s) / 362K

Publishing Venue

IBM

Related People

Funk, MR: AUTHOR [+4]

Abstract

A structure for the support of Global Object Locks in a manner which efficiently uses a processor cache is disclosed. Also discussed is a storage management mechanism for further enhancing this efficiency.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 14% of the total text.

      High performance processor designs rely on the existence of a
cache; one or more minimal access latency storage buffers from which
processors access data and which hold a subset of the data found in a
longer latency main storage.  Typically, a cache maintains its data
in blocks of storage aligned on a storage boundary equal to the size
of the block.  Each portion of the cache capable of holding such a
block of storage is called a cache line.  When a processor accesses
the cache and incurs a cache miss (i.e., storage block not found in
the cache), main storage is accessed to bring an associated block of
storage into a cache line.  In some cache designs (e.g., store-in /
write-in), if the cache line selected for replacement has been
changed, the block of storage in that cache line must first be
written back to main storage or to some lower level of cache.  A
cache miss often causes subsequent processing to stall until data is
returned from main storage.  This process can take many tens of
processor cycles.  The latency can be worse when multi-processor
cache coherency is considered.  If a processor "A" contains a
changed block of storage required by a processor "B", processor "A"
must transfer the block of storage to processor "B", perhaps first
by writing it back to main storage.  In short, there is a
potentially severe impact on the peak performance of a processor
whenever it must access data not currently contained in its cache.

      Similarly, processors support one or more Translation Lookaside
Buffers (TLBs) for translation of effective addresses (and virtual
addresses) into physical main storage addresses.  A typical example
is the translation of a virtual page number into a real page number,
where a page is 512 to 4096 bytes in size and aligned on an integral
boundary.  The TLB contains a small subset of all of the pages held
in main storage.  When an address being translated is not found in
the TLB, the processor stalls until the hardware has translated the
address through translation tables maintained by software in main
storage.  As with the cache, there can be a significant performance
penalty on a TLB miss.  The TLB miss rate increases as the number of
different addresses (i.e., pages) used in a short period of time
increases.

      Some data structures, for example hash tables, have no
guaranteed locality of reference either in terms of the cache or the
number of pages accessed.  In addition to the storage access
associated with a hash table's chain anchor, each element on a hash
table's linked list might be in a different block of storage
(differen...