Configurable Level-1 Caches that can function as Queues, Streams, and Vector Registers
Publication Date: 2010-Sep-17
The IP.com Prior Art Database
Disclosed are hardware enhancements to a level-one (L1) cache, and some corresponding enhancements to the register file, that allow user programs to configure individual cache ways as queues, streams, or vectors.
Disclosed is a method that lets user programs configure each cache way of a level-1
cache as a word-wide vector register, a stream, or a simple FIFO queue, as required.
Such cache features would reduce conflict misses in a shared cache. Further, this
allows a sequential thread to be split into a cache-way-loader thread
(memory access thread) and a compute thread, thus reducing the number of
independent programs running simultaneously in a core and also increasing the
effective instruction-level-parallelism per program (by issuing two of its threads
The configurability of the cache ways is implemented via introduction of read- and
write-pointers and one or more special purpose registers (SPRs) in each cache way.
In an exemplary implementation of the disclosed method, each cache way is equipped
with the following SPRs:
A special-purpose register (SPR) that indicates the current configuration of the cache way. This SPR can be set by a user-mode instruction such as mtspr. Each cache way can be in one of several states: normal cache way, software-managed memory, queue, vector, or stream.
MFLOOR and MCEILING SPRs per cache way, which indicate the range of memory addresses held in that cache way when the cache way is configured as something other than a normal cache way.
A WAYSTRIDE SPR to indicate the stride of the data held in that way.
A WAYSIZE SPR per cache way. This is required for vector registers when vectors are shorter than the hardware way size, but is not required for queues or streams.
A read pointer (READPTR) SPR and a write pointer (WRITEPTR) SPR per L1 cache way, which are used when a way is configured as a queue, vector, or stream. READPTR is automatically incremented on a queue/vector/stream read, and WRITEPTR on a queue/vector/stream write, in both cases modulo the number of cache words in a way, as described below.
A VALID bit per cache word, not just per cache line.
An SPR that is used to indicate when a memory location is cached in an invalid manner across the cache ways.
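The per-way state listed above can be pictured with a small software model. The following Python sketch is illustrative only: the names CacheWay, WayMode, and mode are hypothetical (the disclosure text does not name the configuration SPR here), and using the per-word VALID bits as full/empty flags for a queue is an assumption, not something the disclosure states. It shows only the stated behavior that READPTR and WRITEPTR advance modulo the number of words in a way.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class WayMode(Enum):
    """States a cache way can be in, as listed in the disclosure."""
    NORMAL = "normal cache way"
    SW_MANAGED = "software-managed memory"
    QUEUE = "queue"
    VECTOR = "vector"
    STREAM = "stream"


@dataclass
class CacheWay:
    """Hypothetical software model of one configurable L1 cache way."""
    words: int                       # number of cache words in the way
    mode: WayMode = WayMode.NORMAL   # configuration SPR (name assumed)
    mfloor: int = 0                  # MFLOOR: lowest address held
    mceiling: int = 0                # MCEILING: highest address held
    waystride: int = 1               # WAYSTRIDE
    waysize: int = 0                 # WAYSIZE (used for short vectors)
    readptr: int = 0                 # READPTR
    writeptr: int = 0                # WRITEPTR
    data: List[int] = field(default_factory=list)
    valid: List[bool] = field(default_factory=list)  # one VALID bit per word

    def __post_init__(self) -> None:
        self.data = [0] * self.words
        self.valid = [False] * self.words

    def write(self, value: int) -> bool:
        """Store at WRITEPTR, then advance it modulo the way size.

        Assumption: a set VALID bit means the word is still unread,
        so the write is refused (queue full).
        """
        if self.valid[self.writeptr]:
            return False
        self.data[self.writeptr] = value
        self.valid[self.writeptr] = True
        self.writeptr = (self.writeptr + 1) % self.words
        return True

    def read(self) -> Optional[int]:
        """Load from READPTR, then advance it modulo the way size.

        Assumption: a clear VALID bit means there is nothing to read.
        """
        if not self.valid[self.readptr]:
            return None
        value = self.data[self.readptr]
        self.valid[self.readptr] = False
        self.readptr = (self.readptr + 1) % self.words
        return value
```

In this model a way with four words accepts four writes, after which WRITEPTR has wrapped back to 0; a fifth write is refused until a read frees a word.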
Along with the hardware modifications to the cache ways, the hardware logic that operates the cache ways must also be modified to use the above SPRs, in addition to the base cache logic. The modifications are described as follows:
On a regular cache access (LOAD or STORE), all cache ways are accessed in parallel, as before. However, a reference address will be in the range specified by one or none of the (MFLOOR, MCEILING) pairs of the configured cache ways. If there is no such cache way hit in the configured ways but there is a hit in one of the non-configured ways, then the reference behaves as a normal cache reference. If there is a way hit in one of the configured ways and no hit in the normal ways, then the reference is a valid reference to a configured way and is satisfied as described below. All other cases are invalid...
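The decision described above can be sketched as a small classification function. This is a hedged illustration in Python, not the disclosed hardware logic: classify_access and its parameters are hypothetical names, the bounds are assumed to be inclusive, and a reference that hits nowhere is assumed to be an ordinary cache miss, a case the text above does not spell out.

```python
from typing import List, Optional, Tuple


def classify_access(
    addr: int,
    configured_ranges: List[Tuple[int, int]],
    normal_hit: bool,
) -> Tuple[str, Optional[int]]:
    """Classify a LOAD/STORE reference.

    configured_ranges holds one (MFLOOR, MCEILING) pair per configured way;
    normal_hit says whether the usual tag lookup hit in a non-configured way.
    Returns (kind, index), where index is the position of the matching pair
    in configured_ranges, or None.
    """
    # Compare the address against every range at once, mirroring the
    # parallel lookup across ways (bounds assumed inclusive).
    matches = [
        i for i, (mfloor, mceiling) in enumerate(configured_ranges)
        if mfloor <= addr <= mceiling
    ]

    if len(matches) > 1:
        # The text expects at most one range to match.
        return ("invalid", None)
    if not matches:
        # Assumption: no hit anywhere is treated as an ordinary miss.
        return ("normal", None) if normal_hit else ("miss", None)
    if normal_hit:
        # Hit in a configured way and in a normal way at the same time.
        return ("invalid", None)
    return ("configured", matches[0])
```

For example, with configured ranges (0x1000, 0x10FF) and (0x2000, 0x20FF), a reference to 0x1010 with no normal-way hit is classified as a reference to the first configured way, while the same address with a simultaneous normal-way hit is classified as invalid.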