Method for a load-buffer operation for load-store processors

IP.com Disclosure Number: IPCOM000012205D
Publication Date: 2003-Apr-16
Document File: 4 page(s) / 71K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a load-buffer operation for load-store processors. Benefits include improved performance and improved ease of implementation.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 51% of the total text.

Background

Load-store architectures are the basis for RISC processors. The architecture is also becoming more popular in high-performance specialized processors, such as very long instruction word (VLIW) digital signal processors (DSPs) and packet processors. Load-store architectures require that data residing in memory be loaded into an operand register before it can be used by succeeding instructions. The number of cycles from when a load operation is executed until the data can be used as a register operand by a later instruction is known as the load-use penalty. The most effective technique for mitigating the load-use penalty is software pipelining. This technique is particularly effective in multiple-issue architectures, such as VLIW: load operations that fetch data for succeeding instructions can be issued at the same time that the register operands from a previous load are being operated on.
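The effect of overlapping loads with computation can be illustrated with a small cycle-count model. This is only a sketch under stated assumptions, not part of the disclosure: it assumes a load issues in one cycle, that the dependent instruction can issue exactly `penalty` cycles after its load, and that nothing else stalls the machine. The function names `serial_cycles` and `pipelined_cycles` are hypothetical.

```python
def serial_cycles(n: int, penalty: int) -> int:
    """Cycles to consume n loaded operands without software pipelining.

    Assumed model: load i issues, its dependent instruction issues
    `penalty` cycles later, and only then does load i + 1 issue.
    Each operand therefore occupies penalty + 1 cycles.
    """
    return n * (penalty + 1)


def pipelined_cycles(n: int, penalty: int) -> int:
    """Cycles to consume n loaded operands with software pipelining.

    Assumed model: load i issues in cycle i and its dependent
    instruction issues in cycle i + penalty, so the last use issues in
    cycle (n - 1) + penalty and the whole run takes n + penalty cycles.
    """
    return n + penalty if n > 0 else 0


if __name__ == "__main__":
    # 100 operands with a four-cycle load-use penalty.
    print(serial_cycles(100, 4))     # 500
    print(pipelined_cycles(100, 4))  # 104
```

Under this model the penalty is paid once, while the pipeline fills, rather than once per operand.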

General description

Disclosed is a method for a load-buffer operation for load-store processors. It can be used in any processor design where a large amount of data residing in memory must be processed. The operation may be used in the processor core.

Advantages

The disclosed method provides advantages, including:

• Improved performance due to simplifying stall detection for load data

• Improved performance due to being a clean method for providing memory data to the processor core

• Improved performance due to reducing the number of multiported registers required to software-pipeline load data

• Improved performance due to reducing program memory size requirements, as fewer bits are required to encode operands

• Improved ease of implementation due to requiring one read port and one write port per load buffer

Detailed description

Disclosed is a method for a load buffer operation for load-store processors. For example, assume that some number, n, of data operands must be added together and the load-use penalty is four cycles. Also, assume a simple VLIW model where one memory instruction can be issued in parallel with one arithmetic instruction (see Figure 1). Pseudo-code can describe the software pipelining technique (see Figure 2).
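The following is a minimal simulation of a software-pipelined sum on the simple VLIW model assumed in this example. It is not the pseudo-code of Figure 2, which is not reproduced in this text. The function name `pipelined_sum`, the two-slot cycle loop, and the rule that the arithmetic slot reads its operand before the memory slot writes a new one in the same cycle are all assumptions made for illustration.

```python
from collections import deque

LOAD_USE_PENALTY = 4  # cycles, as assumed in the example


def pipelined_sum(data, penalty=LOAD_USE_PENALTY):
    """Sum `data` on a toy two-slot VLIW: one load and one add per cycle.

    Returns (total, cycles, max_in_flight), where max_in_flight is the
    largest number of loaded values waiting to be consumed at one time.
    """
    n = len(data)
    in_flight = deque()  # (ready_cycle, value) for loads not yet consumed
    total = 0
    max_in_flight = 0
    cycle = 0
    next_load = 0
    consumed = 0
    while consumed < n:
        # Arithmetic slot: consume the oldest load once its data is usable.
        if in_flight and in_flight[0][0] <= cycle:
            _, value = in_flight.popleft()
            total += value
            consumed += 1
        # Memory slot: issue the next load; usable `penalty` cycles later.
        if next_load < n:
            in_flight.append((cycle + penalty, data[next_load]))
            next_load += 1
        max_in_flight = max(max_in_flight, len(in_flight))
        cycle += 1
    return total, cycle, max_in_flight


if __name__ == "__main__":
    # Ten operands, 1..10, with a four-cycle load-use penalty.
    print(pipelined_sum(list(range(1, 11))))  # (55, 14, 4)
```

With ten operands the run takes 14 cycles (ten adds plus four cycles to fill the pipeline), and at most four loaded values are outstanding at once, one for each cycle of load-use penalty.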

This technique requires extra operand registers (R2, R3, R4, R5) for reserving space for in-flight load data. In this example, only one source operand from memory is required by the add operation for every cycle, so one extra register is required for every cycle of load-use penalty. If two source operands from memory are required for every cycle, then the number of extra operand registers required doubles. As the number of pipeline stages grows to keep up with the ever-increasing CPU clock rate, more operand registers are requir...