Method for a memory coprocessor to prefetch objects
Disclosure Number: IPCOM000010334D
Publication Date: 2002-Nov-20
Document File: 4 page(s) / 147K

Publishing Venue

The Prior Art Database


Disclosed is a method for a memory coprocessor to prefetch objects. Benefits include improved functionality.

At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 36% of the total text.


Overlapping memory access latency with CPU processing becomes increasingly important as the gap between CPU processing speeds and memory access latencies continues to widen. Performing simple memory operations on an expensive CPU is not cost-effective, especially if the processor has a deep pipeline that is frequently stalled or flushed.

Processors conventionally use prefetch techniques to detect frequently occurring memory access patterns (such as strides, linked lists, and graph traversal) and to prefetch additional cache lines. This scheme has several drawbacks, including:

• Several hundred CPU clock cycles elapsing between the start of a memory access pattern and the detection of the pattern

• Complex microcode logic to anticipate the variety of simple memory access patterns

• Microcode logic that is difficult to implement for complex application-specific memory access patterns

• Object-oriented programming that encourages frequent method invocations, which obscure context and raise the chance that short-lived memory access patterns go undetected

• A dynamic code execution path that may touch objects in a pattern different from the one suggested by the underlying data structure

• Microcode-level prefetch that has a relatively small window of instructions and cannot prefetch objects that may be referenced beyond this window

• Prefetched data that goes unused when the detected pattern is incorrect, so that a portion of bus bandwidth is used inefficiently

• Prefetch of one cache line at a time
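The difference between two of the access patterns named above, strides and linked lists, can be sketched in a few lines of C. This is only an illustration: the `node` type and both function names are hypothetical and do not come from the disclosure. In the strided loop, every future address can be computed from the loop index. In the list walk, each address becomes known only after the previous node has been loaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type for illustration; not taken from the disclosure. */
struct node {
    int value;
    struct node *next;
};

/* Strided access: the address of the next element is base + i * stride,
   so it can be computed before the current element is loaded. */
long sum_strided(const int *a, size_t n, size_t stride)
{
    assert(stride > 0); /* a zero stride would never terminate */
    long total = 0;
    for (size_t i = 0; i < n; i += stride)
        total += a[i];
    return total;
}

/* Linked-list access: the next address is stored inside the current node,
   so each load depends on the one before it (pointer chasing). */
long sum_list(const struct node *head)
{
    long total = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        total += p->value;
    return total;
}
```

Both functions compute a simple sum; what differs is how much the memory system can know about the next address ahead of time.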

General description

The disclosed method is a prefetch operation for a memory coprocessor that uses virtual addresses. The method includes an additional prefetch instruction and describes the language extensions required to utilize the memory coprocessor for prefetching.

The disclosed method supports object prefetching from memory in a runtime environment. It can prefetch multiple lists of objects whose access patterns would otherwise be difficult to detect within the small instruction window available to the microarchitecture. The compiler passes object size information, which makes the prefetcher object-aware and enables it to prefetch the correct amount of data instead of the conventional cache line size. For processors designed for multithreaded server/workstation environments, the disclosed method enables memory access operations for one thread to proceed in parallel with instruction execution for another thread.
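The idea of prefetching a whole object rather than one cache line can be approximated in software. The sketch below is not the disclosed instruction, whose encoding and semantics are not given in this abbreviated text. It assumes a 64-byte cache line and relies on the GCC/Clang builtin `__builtin_prefetch`. The names `lines_spanned`, `prefetch_object`, `struct object`, and `sum_objects` are hypothetical. The size is supplied with `sizeof`, mirroring the point that the compiler knows the object size statically.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size; real hardware varies. */
#define CACHE_LINE 64u

/* Number of cache lines touched by `size` bytes starting at `addr`. */
size_t lines_spanned(uintptr_t addr, size_t size)
{
    if (size == 0)
        return 0;
    uintptr_t first = addr / CACHE_LINE;
    uintptr_t last = (addr + size - 1) / CACHE_LINE;
    return (size_t)(last - first + 1);
}

/* Hypothetical software stand-in for an object-aware prefetch: issues one
   prefetch hint per cache line covered by the object and returns the count. */
size_t prefetch_object(const void *obj, size_t size)
{
    if (obj == NULL)
        return 0;
    uintptr_t addr = (uintptr_t)obj;
    uintptr_t base = addr - addr % CACHE_LINE;
    size_t n = lines_spanned(addr, size);
    for (size_t i = 0; i < n; i++)
        __builtin_prefetch((const void *)(base + i * CACHE_LINE), 0, 3);
    return n;
}

/* Hypothetical object that spans several cache lines. */
struct object {
    struct object *next;
    long value;
    char data[200];
};

/* Walks a list, requesting the next object while the current one is used. */
long sum_objects(const struct object *head)
{
    long total = 0;
    for (const struct object *p = head; p != NULL; p = p->next) {
        /* sizeof is evaluated at compile time and does not touch p->next */
        prefetch_object(p->next, sizeof *p->next);
        total += p->value;
    }
    return total;
}
```

In this sketch the prefetch hints still run on the CPU; the disclosure instead describes handing such requests to a separate memory coprocessor.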

The method identifies cases where offloading memory operations to a separate memory coprocessor is beneficial. The added instruction enables software to exploit this division of functionality, which is beneficial and cost-effective for complex CPUs...