Method for a high performance ALAT implementation in an OOO processor

IP.com Disclosure Number: IPCOM000033798D
Publication Date: 2004-Dec-28
Document File: 3 page(s) / 17K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a high performance advanced load address table (ALAT) implementation in an out-of-order (OOO) processor. Benefits include improved functionality and improved performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.

Background

An ALAT is an architectural feature that supports compiler-driven data speculation. Software can execute loads early by promoting them, as advanced loads, above their original positions in the program and replacing the original instructions with checks. The ALAT ensures that speculation remains correct when conflicting stores occur between the advanced load and the check. Data speculation takes two forms. In the first form, software promotes only the load instruction as an advanced load and leaves a check load (Ld.c) in its place. The check load acts as a no-operation (NOP) instruction when data speculation passes and as a load when data speculation fails. In the second form, software promotes the load and all its dependents, leaving behind a speculation check instruction (Chk.a) that branches to recovery code if data speculation fails.
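The two forms can be illustrated with a small software model. The sketch below is not taken from the disclosure: the function names, the dictionary used as memory, and the `p == q` comparison standing in for an ALAT lookup are all assumptions made for illustration. It shows that both speculative forms return the same result as the unspeculated code, whether or not the intervening store aliases the advanced load.

```python
def original(mem, p, q, x):
    """Unspeculated order: store, then load, then the load's dependent."""
    mem = dict(mem)     # work on a copy so the caller's memory is untouched
    mem[p] = x          # store that may alias the load
    y = mem[q]          # load in its original position
    return y + 1        # dependent of the load


def form1_check_load(mem, p, q, x):
    """Form 1: only the load is promoted; a check load (Ld.c) stays behind."""
    mem = dict(mem)
    y = mem[q]          # advanced load, hoisted above the store
    mem[p] = x          # intervening store
    if p == q:          # assumed stand-in for "the ALAT entry was invalidated"
        y = mem[q]      # Ld.c fails: it behaves as a load and re-reads memory
    # otherwise Ld.c passes and behaves as a NOP
    return y + 1        # the dependent still runs after the check


def form2_speculation_check(mem, p, q, x):
    """Form 2: the load and its dependent are promoted; Chk.a stays behind."""
    mem = dict(mem)
    y = mem[q]          # advanced load
    z = y + 1           # dependent, also hoisted above the store
    mem[p] = x          # intervening store
    if p == q:          # Chk.a fails: branch to recovery code
        y = mem[q]      # recovery re-executes the load...
        z = y + 1       # ...and every promoted dependent
    return z


mem = {0x10: 5, 0x20: 7}
for p in (0x10, 0x20):  # first a non-aliasing store, then an aliasing one
    forms = (original, form1_check_load, form2_speculation_check)
    print(hex(p), [f(mem, p, 0x20, 9) for f in forms])
# prints:
# 0x10 [8, 8, 8]
# 0x20 [10, 10, 10]
```

In the first form the dependent waits behind the check, while in the second form it has already run by the time the check executes and is redone only on failure.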

In an in-order processor, an ALAT is implemented as a hardware table in which advanced loads allocate entries holding their target register identifiers and physical addresses. Subsequent store and snoop operations look up their physical addresses in the ALAT and invalidate all matching entries. The check instructions, such as Ld.c and Chk.a, look up the ALAT by target register identifier and fail only when no matching valid entry is found. This simple implementation works because the advanced loads, the stores, and the checks execute in program order. It does not work in an out-of-order processor, where the operations can execute in any order. For example, a check can finish before an intervening store executes.
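The table behavior described above can be sketched as a toy model. Everything in it is a simplifying assumption rather than the disclosed hardware: the `Alat` class, the `run` helper, the operation tuples, one entry per target register, and exact-address matching in place of real address-range comparison. It reproduces the allocate, invalidate, and check steps, and shows how a check that executes ahead of a conflicting store passes when it should fail.

```python
class Alat:
    """Toy in-order ALAT with one entry per target register (assumption)."""

    def __init__(self):
        self.entries = {}  # target register id -> physical address

    def advanced_load(self, reg, addr):
        # An advanced load allocates (or replaces) the entry for its target.
        self.entries[reg] = addr

    def store(self, addr):
        # Stores and snoops invalidate every entry with a matching address.
        self.entries = {r: a for r, a in self.entries.items() if a != addr}

    def check(self, reg):
        # Ld.c / Chk.a pass only if a valid entry exists for the register.
        return reg in self.entries


def run(ops):
    """Execute ops in the order given; return the outcome of each check."""
    alat = Alat()
    outcomes = []
    for op in ops:
        if op[0] == "ld.a":
            alat.advanced_load(op[1], op[2])
        elif op[0] == "st":
            alat.store(op[1])
        elif op[0] == "chk":
            outcomes.append(alat.check(op[1]))
    return outcomes


# Program order: the conflicting store is seen before the check, so it fails.
in_order = [("ld.a", "r6", 0x100), ("st", 0x100), ("chk", "r6")]
# Out-of-order execution: the check runs before the store and wrongly passes.
reordered = [("ld.a", "r6", 0x100), ("chk", "r6"), ("st", 0x100)]

print(run(in_order))   # prints [False]
print(run(reordered))  # prints [True]
```

The second result is the hazard: the stale value loaded into `r6` is accepted even though a conflicting store belongs between the advanced load and the check in program order.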

To solve the out-of-order ALAT issue, conventional solutions execute the advanced load, the intervening stores, the check load, and the dependents of the check load in program order. The real gain of advancing the load, as opposed to prefetching the data, is partly lost because of the serialization forced between the dependents and the check load.

General description

The disclosed method is a high-performance ALAT implementation. Its key feature is that check loads are performed in the same way as speculation checks. The dependents of the check load are transformed into dependents of the advanced load. When the check load passes, the dependents already hold the correct data. When the check load fails, the bad data speculation is automatically recoverable by flushing all the older instructions. This procedu...