Method for a dynamically resizable circular store queue for large out-of-order processors

IP.com Disclosure Number: IPCOM000033799D
Publication Date: 2004-Dec-28
Document File: 3 page(s) / 34K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method for a dynamically resizable circular store queue for large out-of-order processors. Benefits include improved functionality and improved performance.

This is the abbreviated version, containing approximately 54% of the total text.

Background

              Conventionally, speculatively executed stores in an out-of-order processor are tracked in a structure (StoreQ) until they retire and drain into merge buffers or caches. StoreQ entries are allocated in program order in the front end and written with a physical address and data at execution time. Large out-of-order cores require a large StoreQ to prevent frequent stalls when the front end runs out of StoreQ entries. However, the StoreQ also forwards data to younger overlapping loads and therefore lies in the critical path of data delivery to the execution units, so enlarging it makes that path harder to time.

              Previous implementations focused on reducing the effective size requirement of the StoreQ by draining entries quickly after retirement into merge buffers placed between the StoreQ and the cache. A merge buffer, however, puts additional pressure on the data delivery multiplexer (MUX) and adds the complexity of merging and deallocating entries within the buffers.

              Conventional implementations either size the StoreQ for the most common workloads and accept a performance penalty on store-streaming applications, or use a large data-forwarding StoreQ in the critical path or a merge buffer to absorb retired stores faster. The latter two approaches incur costly design cycles due to critical paths and control complexity.

General description

              The disclosed method is a dynamically resizable circular store queue for large out-of-order processors. The method splits the StoreQ structure into two parts. One half is in the critical path for data delivery. The other half is outside the critical path and is used only to accommodate store-streaming code sections.
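The two-part organization can be illustrated with a small behavioral model. This is a minimal sketch under stated assumptions, not the disclosed implementation: the abbreviated text does not give the resize policy or the pointer-wrap logic, so the class name ResizableStoreQ, the half_size parameter, the grow-when-the-first-half-fills rule, and the shrink threshold are all hypothetical. A Python deque stands in for the circular buffer, and data forwarding is not modeled.

```python
from collections import deque


class ResizableStoreQ:
    """Behavioral model only; names and policies here are illustrative assumptions."""

    def __init__(self, half_size):
        self.half_size = half_size  # entries per half (hypothetical parameter)
        self.entries = deque()      # oldest store on the left, youngest on the right
        self.extended = False       # True while the second half is in use

    @property
    def capacity(self):
        # One half is usable by default; both halves once extended.
        return 2 * self.half_size if self.extended else self.half_size

    def allocate(self, store_id):
        """Allocate in program order; return False if the front end must stall."""
        if len(self.entries) == self.capacity:
            if self.extended:
                return False        # both halves are full: stall
            self.extended = True    # assumed policy: grow when the first half fills
        self.entries.append(store_id)
        return True

    def retire(self):
        """Retire the oldest store (raises IndexError if the queue is empty)."""
        store_id = self.entries.popleft()
        # Assumed policy: shrink back to one half once occupancy is low enough.
        if self.extended and len(self.entries) <= self.half_size // 2:
            self.extended = False
        return store_id
```

In this sketch, a short burst of stores stays entirely within the first half, while a store-streaming section pushes the queue into the second half until retirement drains it again. The shrink threshold of half_size // 2 is an arbitrary choice made only to avoid toggling between the two sizes on every allocation.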

Advantages

              The disclosed method provides advantages, including:

•             Improved functionality due to providing a hybrid dynamically resizable StoreQ that adapts to the application

•             Improved performance because the larger StoreQ reduces front-end stalls on store-streaming code

•             Improved timing because only half of the StoreQ structure lies in the critical path for data delivery

 


Detailed description

              The disclosed method divides StoreQ into two parts because more than 97% of loads hit in the StoreQ only when the StoreQ is less than half full. For example, let each half...