Method for reserving CPU cache memory for a high-performance application scratchpad

IP.com Disclosure Number: IPCOM000019311D
Publication Date: 2003-Sep-10
Document File: 7 page(s) / 121K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for reserving central processor unit (CPU) cache memory for a high-performance application scratchpad. Benefits include improved functionality, improved performance, and improved ease of implementation.

Background

         Caching gives CPUs much faster access to program instructions and data than fetching them from system memory. CPU designers therefore include fast caches, sometimes in multiple levels (such as Level 1 and Level 2, designated L1 and L2), that are managed automatically by sophisticated caching algorithms in hardware. These algorithms are imperfect, however, and sometimes mispredict what should remain in the cache and what should be evicted to make room for additional data and instructions. This imperfection occurs because the caching algorithms have no domain-specific knowledge of the running application (see Figure 1).

         Conventional systems enable applications to reserve and manage memory on disk drives and in system main memory. The shortcoming of this approach is that these memories have very high latency compared to the CPU cache, which typically results in poor application performance.

         Conventionally, heuristics embedded in CPU caching policies dictate which data is stored in cache lines and for how long. While some processors provide prefetch instructions, these are typically treated as hints, and the requested memory may never be brought into the cache (see the sketch below). Furthermore, data placed in the cache is not guaranteed to remain there for as long as the application needs it.
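         For illustration only (this example is not part of the original disclosure): on GCC and Clang, software prefetching can be expressed with the __builtin_prefetch intrinsic. The prefetch distance of 16 elements below is an assumed figure; the hardware is free to drop the hint, and even a fetched line may be evicted before it is used.

#include <stddef.h>

/* Sketch: software prefetching via the GCC/Clang __builtin_prefetch
   intrinsic. The CPU treats the request as a hint only; the line may
   never be fetched and, even if it is, it may be evicted again before
   the loop reaches it. */
double sum_with_prefetch(const double *data, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            /* rw = 0 (read), locality = 3 (keep in all cache levels) */
            __builtin_prefetch(&data[i + 16], 0, 3);
        sum += data[i];
    }
    return sum;
}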

         Conventional applications and operating systems (OSs) allocate and manage data in system memory. The OS determines which application runs; the CPU and its caching policies determine what resides in the CPU cache, based on which applications are running and which data and instructions are used frequently. The applications themselves do not control what resides in the CPU cache. This approach significantly hurts application and system performance when the OS and/or CPU heuristics fail to predict which instructions or data the application will use next, which frequently happens when the core application loops do not fit in the cache or when the application is swapped out often.
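         A minimal sketch of that failure mode (again not part of the original disclosure), assuming a working set of 8 MB, an arbitrary figure chosen to exceed the last-level cache:

#include <stddef.h>

#define WORKING_SET_BYTES (8u * 1024u * 1024u)  /* assumed > cache size */

long sum_two_passes(const unsigned char *buf)
{
    long sum = 0;
    /* The first pass pulls buf into the cache, but because the working
       set exceeds cache capacity, the earliest lines are evicted before
       the second pass revisits them, so every access misses again. The
       application has no way to ask the CPU to keep the data resident. */
    for (int pass = 0; pass < 2; pass++)
        for (size_t i = 0; i < WORKING_SET_BYTES; i++)
            sum += buf[i];
    return sum;
}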

         One conventional channel processor enables application developers to pin and unpin cache lines that correspond to specific variables in main memory. The data, however, continues to reside in system memory. This approach consumes main-memory resources and requires modified data to be written back to main memory when the lines are unpinned. It is therefore a poor scheme for providing temporary fast storage. A solution is required that overcomes these problems.
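         The disclosure does not name that processor's interface, so the sketch below uses hypothetical pin_cacheline() and unpin_cacheline() calls, invented here purely to illustrate the pin/unpin pattern and its write-back cost; they are not a real API.

/* Hypothetical API, for illustration only: pin_cacheline() and
   unpin_cacheline() are not real library calls. The variable still has
   a backing location in main memory, so the modified line must be
   written back when it is unpinned. */
extern int  pin_cacheline(void *addr);    /* hypothetical */
extern void unpin_cacheline(void *addr);  /* hypothetical */

void bump_counter(int *hot_counter)
{
    pin_cacheline(hot_counter);        /* line held in cache while pinned */
    for (int i = 0; i < 1000000; i++)
        (*hot_counter)++;              /* guaranteed cache hits */
    unpin_cacheline(hot_counter);      /* dirty line written back to
                                          main memory on unpin */
}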

         Application data is likely to be swapped out of the cache for the following reasons:

•         Application’s low priority within OS scheduling...