Method for hierarchical data prefetching on a memory hierarchy

IP.com Disclosure Number: IPCOM000029127D
Publication Date: 2004-Jun-16
Document File: 4 page(s) / 31K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for hierarchical data prefetching on a memory hierarchy. Benefits include improved functionality, improved performance, and improved design flexibility.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 29% of the total text.

Method for hierarchical data prefetching on a memory hierarchy

Background

The memory hierarchy of a modern high-performance microprocessor is partitioned into multiple levels of cache plus main memory. Optimizing a single level of cache is typically insufficient to improve overall performance.

Many commercial applications, such as database servers, have very large data sets that do not fit into any cache level. Data-cache stalls therefore become a significant performance bottleneck for these applications. They are typically pointer-intensive, and conventional techniques may not solve the data-cache performance issue. For example, the data-prefetching techniques developed for array-based applications are ineffective for pointer-intensive applications: the data access patterns are irregular, and prefetch addresses are difficult to calculate sufficiently in advance to hide the latency.
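The contrast between the two access patterns can be sketched in C. The struct name, field names, and prefetch distance below are illustrative assumptions, not part of the disclosure; `__builtin_prefetch` is the GCC/Clang prefetch hint.

```c
#include <stddef.h>

/* Hypothetical list node, for illustration only. */
struct node {
    struct node *next;
    int payload[15];
};

/* Array traversal: the address of a[i + 16] is a simple linear
   function of i, so a prefetch can be issued many iterations
   ahead of its use -- enough distance to hide memory latency. */
long sum_array(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + 16 < n)
            __builtin_prefetch(&a[i + 16]); /* address known early */
        sum += a[i];
    }
    return sum;
}

/* Pointer chasing: the address of the next node is stored inside
   the current node, so it cannot be computed until the current
   load completes. At best the prefetch runs one node ahead, which
   is too late to hide a cache-miss latency. */
long sum_list(const struct node *p)
{
    long sum = 0;
    while (p != NULL) {
        __builtin_prefetch(p->next); /* only one hop of lookahead */
        sum += p->payload[0];
        p = p->next;
    }
    return sum;
}
```

In the array case the prefetch distance (16 here, chosen arbitrarily) can be tuned to the miss latency; in the pointer-chasing case no comparable distance exists, which is the irregularity the disclosure refers to.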

Increasing the size of a cache can alleviate this problem, but often at the expense of increased cache access latency. Software techniques are also used, such as reordering the fields of data structures and manipulating data allocation or reallocation. However, they require hardware and software support, and in some cases they can violate program semantics if not done carefully. Among these techniques, data prefetching remains a promising way to address the data-cache performance issue: it typically has no immediate effect on correctness, and any additional hardware support is usually not on the critical path.
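Field reordering, one of the software techniques mentioned above, can be illustrated as follows. The struct names and field choices are hypothetical, chosen only to show the idea of separating hot (frequently accessed) fields from cold ones.

```c
/* Before: hot and cold fields interleaved. During a key-search
   traversal, each cache line fetched carries mostly cold bytes. */
struct record_before {
    long key;                     /* hot: read on every lookup  */
    char description[48];         /* cold: read only on a match */
    struct record_before *next;   /* hot: read on every lookup  */
};

/* After: hot fields grouped together and cold data moved behind a
   pointer, so far more hot records fit per fetched cache line.
   Only the layout changes; program semantics are preserved unless
   code depends on the original field offsets -- the kind of
   hazard the disclosure warns about. */
struct record_after {
    long key;                     /* hot */
    struct record_after *next;    /* hot */
    char *description;            /* cold, stored out of line */
};
```

With 64-byte cache lines and 8-byte pointers, roughly one `record_before` fits per line versus two or three `record_after` hot parts, reducing the number of lines touched per traversal.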

The trend in commercial applications is toward ever-larger working sets. For a large working set, the conventional wisdom of always prefetching data into the lowest cache level leads to poor performance because of cache pollution and bandwidth requirements.

              Each prefetcher has its strengths and weaknesses, and no single known prefetcher fits all levels of a memory hierarchy. For a prefetching mechanism to be effective, the following issues must be addressed:

• Identifying prefetching candidates: Data that will be used soon, and loads that are likely to miss in the caches, must be identified.

• Issuing prefetches sufficiently early: Addresses must be obtained early enough that prefetches can be issued well in advance to hide the latency, which is challenging because the prefetch address may itself be the target of another load. A key part of this issue is whether enough work is available to overlap with the prefetch latency.

• Overly aggressive prefetching: If prefetches are issued too aggressively, useless prefetches may result, where prefetched data is evicted from the cache before it is used. Overly aggressive prefetching can also significantly increase bandwidth usage, which degrades performance. Aggressive prefetching may...