Method for the hot stack coprocessor
Disclosure Number: IPCOM000007897D
Publication Date: 2002-May-02
Document File: 4 page(s) / 22K

Publishing Venue

The Prior Art Database


Disclosed is a method for the hot stack coprocessor. Benefits include improved functionality and improved performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 46% of the total text.

Method for the hot stack coprocessor


              Computing hardware provides a hierarchy of progressively faster memories to bridge the growing gap between memory and processor speeds. Conventional processor instruction sets include instructions that let compilers exploit this hierarchy, such as the prefetch instructions of 64-bit architectures. Prefetching hides some of the stalls induced by cache misses, but only if the prefetch instruction is issued far enough in advance. The compiler can estimate the required prefetch distance, but it may not be able to issue the prefetch that far ahead, and memory system contention can make the estimate far too optimistic.
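
              The prefetch distance problem can be illustrated with a small sketch. It uses the common textbook estimate (miss latency divided by the time per loop iteration, rounded up), which is an assumption here and not taken from the disclosure; the cycle counts are made-up illustrative values.

```python
import math


def prefetch_distance(miss_latency_cycles: int, cycles_per_iteration: int) -> int:
    """Loop iterations ahead at which a prefetch must be issued to hide a miss.

    Textbook estimate, assumed for illustration: ceil(latency / iteration time).
    """
    if miss_latency_cycles < 0 or cycles_per_iteration <= 0:
        raise ValueError("latency must be >= 0 and iteration time must be > 0")
    return math.ceil(miss_latency_cycles / cycles_per_iteration)


# Hypothetical numbers: the compiler plans for a 200-cycle miss...
planned = prefetch_distance(miss_latency_cycles=200, cycles_per_iteration=25)
# ...but contention stretches the real latency to 600 cycles.
needed = prefetch_distance(miss_latency_cycles=600, cycles_per_iteration=25)
print(planned, needed)
```

              With these values the compiler would schedule prefetches 8 iterations ahead while 24 would be needed, so most of each miss would still stall the processor.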

              Compiler analysis can divide data into a small number of hot data items, for which efficient access is critical, and a large majority of cold data items, for which efficient access is less important. Conventionally, both types of data are allocated together on the stack, which dilutes the effectiveness of the cache.

General description

              The disclosed method is an extension to the existing memory hierarchy that can be controlled more effectively by the compiler. The method allocates the hot data to a special locked region of the data cache that is managed by a coprocessor in such a way as to virtually guarantee cache hits for hot data. The locked region of the data cache is called the hot stack.
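
              As a rough mental model only, the hot stack can be pictured as a fixed-size stack of words with pending backups to main memory. The sketch below is a software toy, not the disclosed hardware: the class name, the method names (allocate, deallocate, load, store, backup_step), the top-relative offsets, and the overflow behaviour are all assumptions made for illustration.

```python
class HotStack:
    """Toy model of a locked cache region used as a stack of words (hypothetical)."""

    def __init__(self, capacity_words: int) -> None:
        self.capacity = capacity_words
        self.words: list[int] = []          # index 0 is the bottom of the stack
        self.dirty: set[int] = set()        # word indices not yet backed up
        self.backing: dict[int, int] = {}   # stands in for main memory

    def allocate(self, n: int) -> None:
        """Reserve n words on top of the stack."""
        if n < 0:
            raise ValueError("n must be non-negative")
        if len(self.words) + n > self.capacity:
            # Assumption: the toy simply refuses to overflow the locked region.
            raise MemoryError("hot stack overflow")
        self.words.extend([0] * n)

    def deallocate(self, n: int) -> None:
        """Release the top n words and drop their pending backups."""
        if not 0 <= n <= len(self.words):
            raise ValueError("cannot deallocate more words than are allocated")
        del self.words[len(self.words) - n:]
        self.dirty = {i for i in self.dirty if i < len(self.words)}

    def _index(self, offset: int) -> int:
        # Offset 0 addresses the word on top of the stack.
        if not 0 <= offset < len(self.words):
            raise IndexError("offset outside the allocated hot stack")
        return len(self.words) - 1 - offset

    def store(self, offset: int, value: int) -> None:
        i = self._index(offset)
        self.words[i] = value
        self.dirty.add(i)                   # remember that a backup is pending

    def load(self, offset: int) -> int:
        return self.words[self._index(offset)]   # always served from the region

    def backup_step(self) -> None:
        """Copy one pending word to the backing store (models the async backup)."""
        if self.dirty:
            i = self.dirty.pop()
            self.backing[i] = self.words[i]
```

              In this toy, loads and stores never touch the backing store, which mirrors the idea that hot data accesses always hit; only backup_step moves data toward main memory.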

              The disclosed method consists of hardware components and software components, including:

§         Locked cache region (or a separate memory of comparable performance)

§         Coprocessor to provide access to the locked cache region and to asynchronously back it up into main memory

§         Algorithm to divide program data into cold and hot data

§         Algorithm to lay out data and exploit the locked cache region
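
              The disclosure does not spell out the hot/cold division algorithm in this abbreviated text. One plausible heuristic, sketched below purely as an assumption, ranks variables by estimated accesses per word and greedily fills a fixed hot-region budget. The Variable type, the access estimates, and the capacity are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    name: str
    size_words: int      # must be positive
    est_accesses: int    # e.g. from profiling or loop-depth weighting (assumed)


def partition_hot_cold(variables, hot_capacity_words):
    """Greedy split: densest variables go to the hot region while they fit."""
    ranked = sorted(
        variables,
        key=lambda v: v.est_accesses / v.size_words,
        reverse=True,
    )
    hot, cold = [], []
    remaining = hot_capacity_words
    for v in ranked:
        if v.size_words <= remaining:
            hot.append(v)
            remaining -= v.size_words
        else:
            cold.append(v)
    return hot, cold


# Illustrative input only; none of these numbers come from the disclosure.
candidates = [
    Variable("i", 1, 1000),
    Variable("acc", 2, 1500),
    Variable("buf", 64, 640),
    Variable("tmp", 1, 10),
]
hot, cold = partition_hot_cold(candidates, hot_capacity_words=4)
print([v.name for v in hot], [v.name for v in cold])
```

              A greedy density ranking is simple but not optimal in general; a real compiler could equally use a knapsack formulation or liveness information to share hot stack words between variables.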


              The disclosed method provides several advantages, including:

§         Hides all stalls due to cache misses through the asynchronous automated backup mechanism, as long as the average data transfer rate for hot data accesses does not exceed the available bandwidth

§         Allocation, deallocation, and hot stack access operations each require a single instruction on low-power, high-performance processors

§         Aggressive compiler optimizations that have conventionally been avoided because they increase register use become profitable, because memory access for hot data is predictable

§         Predictable memory access for a limited amount of hot data is beneficial in real-time applications
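
              The bandwidth condition in the first advantage can be read as a queueing argument: bursts of hot data traffic are tolerable as long as the backup mechanism drains them on average. The sketch below is one simplified interpretation, assuming discrete time steps and a fixed number of words drained per step; the function name and the traffic traces are hypothetical.

```python
def backup_backlog(writes_per_step, drain_per_step):
    """Return (peak backlog, final backlog) of words awaiting backup."""
    backlog = 0
    peak = 0
    for written in writes_per_step:
        # Words written this step queue up; the backup drains a fixed amount.
        backlog = max(0, backlog + written - drain_per_step)
        peak = max(peak, backlog)
    return peak, backlog


# Bursty traffic whose average (8 words / 6 steps) is below the drain rate of 2.
print(backup_backlog([4, 0, 0, 4, 0, 0], drain_per_step=2))
# Sustained traffic above the drain rate: the backlog grows without bound.
print(backup_backlog([3, 3, 3, 3], drain_per_step=2))
```

              In the first trace the backlog stays bounded and empties again, so a small buffer would suffice; in the second it grows every step, which is the situation in which stalls could no longer be hidden.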

Detailed description

              The disclosed method extends the conventional memory hierarchy. Its operation is described under several categories, including:

§         Coprocessor state

§         System interface

§         Hardware operations

§         Compiler support

§         Implementation

Coprocessor state

              The hot stack is a stack of words controlled via four instructions:

§         Allocate n words on th...