Method for natural memory alignment
Disclosure Number: IPCOM000128933D
Publication Date: 2005-Sep-21
Document File: 4 page(s) / 64K

Publishing Venue

The Prior Art Database


Disclosed is a method for natural memory alignment. Benefits include improved functionality, improved performance, and improved power performance.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 50% of the total text.


              Cache memory is a crucial component of modern microprocessors. A low-level cache, which has the shortest latency, is constrained by three equally important but conflicting requirements: power, latency, and bandwidth. The ideal cache would have the lowest power consumption, the shortest latency, and the widest bandwidth. In conventional designs, one requirement is traded off against another. For example, lower power usage is obtained at the cost of longer latency (or vice versa).

              For a 32-bit architecture, a floating-point instruction requires 16 bytes of data and an integer instruction requires 4 bytes of data. To maintain high processor performance, integer instructions require shorter access latency, while floating-point instructions can tolerate longer access latency. The data are typically aligned or rotated based on the instruction. This alignment is completed outside of the cache, which is organized in some binary or sequential order. Sixteen bytes of data are read out from the cache regardless of the type of instruction (floating point or integer). This is required because the correct alignment is not known until the alignment step itself has completed. The approach is inefficient for two reasons. First, the entire cache must be designed to meet the highest bandwidth requirement, which slows down cache access because the circuitry must support the higher loading that the wide bandwidth imposes. Second, the wide-bandwidth cache access consumes high power, which is especially wasteful because most 32-bit cache accesses are for integer instructions (4 bytes).
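
The conventional read-then-align flow described above can be pictured with a small software model. This is only an illustrative sketch, not the disclosed circuit: the function name conventional_read, the 64-byte line, and the restriction that an access stays inside one 16-byte chunk are assumptions made for the example. Only the 16-byte read width and the 4-byte and 16-byte operand sizes come from the text.

```python
READ_BYTES = 16  # width read out on every access in the conventional design (from the text)


def conventional_read(line: bytes, offset: int, size: int):
    """Read a full 16-byte chunk, then align the data outside the cache.

    Returns (requested_data, bytes_read_from_cache).
    Assumption: the access does not cross a 16-byte chunk boundary.
    """
    if size not in (4, 16):
        raise ValueError("size must be 4 (integer) or 16 (floating point)")
    base = offset - (offset % READ_BYTES)   # start of the 16-byte chunk
    shift = offset - base                   # rotation amount applied after the read
    if shift + size > READ_BYTES or base + READ_BYTES > len(line):
        raise ValueError("access crosses a chunk or line boundary")
    chunk = line[base:base + READ_BYTES]    # the cache always supplies 16 bytes
    data = chunk[shift:shift + size]        # alignment happens only after the read
    return data, len(chunk)


line = bytes(range(64))                     # hypothetical 64-byte cache line
data, cost = conventional_read(line, 20, 4)
print(list(data), cost)                     # [20, 21, 22, 23] 16
```

In this model a 4-byte integer access still costs a 16-byte read, which is the inefficiency the paragraph points out.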

General description

              The disclosed method enables a cache to perform alignment internally (self-alignment), so that the internal cache organization is “natural” to the way the cache is accessed.

              The disclosed method exploits the differences between floating-point and integer (FP/INT) instructions. For an integer instruction, the method accesses only four bytes of data, which reduces power consumption and improves latency. For floating-point accesses, the method accesses the cache in the conventional way. Because integer instructions predominate in most applications, overall power consumption is reduced and performance is improved.
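
As a rough illustration of why type-dependent access width helps, the sketch below counts the bytes read from the cache for a short instruction trace, once with a fixed 16-byte read and once with the width chosen by instruction type. The trace (nine integer accesses and one floating-point access) and the names WIDTH and bytes_read are hypothetical, and bytes read is only a crude stand-in for power; the text gives no measured figures.

```python
WIDTH = {"INT": 4, "FP": 16}  # bytes needed per instruction type (from the text)


def bytes_read(trace, type_aware: bool) -> int:
    """Total bytes read from the cache for a sequence of 'INT'/'FP' accesses."""
    if type_aware:
        # Width follows the instruction type: 4 bytes for INT, 16 for FP.
        return sum(WIDTH[kind] for kind in trace)
    # Conventional design: 16 bytes per access regardless of type.
    return WIDTH["FP"] * len(trace)


trace = ["INT"] * 9 + ["FP"]                 # hypothetical instruction mix
print(bytes_read(trace, type_aware=False))   # 160
print(bytes_read(trace, type_aware=True))    # 52
```

The benefit in this model scales with the share of integer accesses; a trace made up only of floating-point accesses shows no difference.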


              The disclosed method provides advantages, including:

•             Improved functionality due to accessing cache in amounts based on the type of instruction (FP/INT)

•             Improved functionality due to enabling a cache to self-align memory internally

•             Improved performance due to reducing the latency of cache accesses for integer instructions

•             Improved power performance due to opportunistically disabling part of the cache circuitry during cache accesses for integer instructions
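
One way to picture the last advantage is a bank-enable mask. The sketch below assumes a hypothetical organization of four banks that each supply 4 bytes, with integer accesses aligned to 4 bytes; this abbreviated text does not describe the actual circuit, so the bank layout and the name bank_enable_mask are illustrative assumptions only.

```python
BANK_BYTES = 4  # assumed: each bank supplies 4 bytes
NUM_BANKS = 4   # assumed: 4 banks x 4 bytes = one 16-byte access


def bank_enable_mask(kind: str, offset: int) -> int:
    """Return a bit mask of the banks to activate (bit i enables bank i)."""
    if kind == "FP":
        # Floating-point access: all banks are enabled, as in the conventional design.
        return (1 << NUM_BANKS) - 1
    if kind == "INT":
        # Integer access: enable only the bank that holds the requested 4 bytes.
        bank = (offset // BANK_BYTES) % NUM_BANKS
        return 1 << bank
    raise ValueError("kind must be 'INT' or 'FP'")


print(bin(bank_enable_mask("FP", 0)))    # 0b1111
print(bin(bank_enable_mask("INT", 20)))  # 0b10
```

In this model an integer access leaves three of the four banks idle, which corresponds to the opportunistic disabling of cache circuitry named in the list above.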

Detailed description