Method for utilizing the MwaitExchange instruction
Disclosure Number: IPCOM000016696D
Publication Date: 2003-Jul-09
Document File: 3 page(s) / 56K

Publishing Venue

The Prior Art Database


Disclosed is a method for utilizing the MwaitExchange instruction. Benefits include improved performance.

This text was extracted from a Microsoft Word document.
This is the abbreviated version, containing approximately 45% of the total text.


Until now, acquiring a memory lock has meant spinning repeatedly on the locked memory. A spin-lock consumes 100% of the CPU on which it runs. On a Hyperthreading (HT) architecture, the spinning logical processor does not free the CPU for the other HT processor. With the Monitor/Mwait instruction pair, a spin-lock can be implemented without continuous spinning, but with a lower probability of acquiring the lock.

On some processor generations, optimization is conventionally enabled by the Halt and Pause instructions. Two additional instructions improve optimization:

•        Monitor - Sets up monitoring of a logical address range for a store to that range by another agent.

•        Mwait - Waits until there is a store to the monitored address range or until another event occurs. If Monitor has not been set up, Mwait simply exits.

The use of these instructions is described in pseudo code (see Figure 1). The running processor is stalled at line 13 until the memory is changed. Because Mwait can also exit on other events, the While loop must repeat and re-check the condition to guarantee that it is true. When the While statement (line 4) finds that the memory matches the required condition (TriggerData), the loop exits at line 20.
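Figure 1 is not reproduced in this abbreviated text, so the line numbers above refer to the original figure and not to the sketch below. As a rough illustration of the wait loop only, the following Python code simulates the behavior in software. It does not execute the real Monitor and Mwait instructions. The names MonitoredWord, monitor, mwait, store and wait_for are hypothetical, and the timeout is an assumption standing in for the "other events" that can end an Mwait.

```python
import threading


class MonitoredWord:
    """Software stand-in for one word of monitored memory (hypothetical)."""

    def __init__(self, value):
        self.value = value
        self._cond = threading.Condition()
        self._stores = 0                  # number of stores so far
        self._armed = threading.local()   # each thread models one processor

    def monitor(self):
        # Models Monitor: remember the store count at arming time.
        with self._cond:
            self._armed.at = self._stores

    def mwait(self, timeout=0.05):
        # Models Mwait: block until a store arrives after monitor(), or until
        # the timeout (an assumed stand-in for "other events") expires.
        with self._cond:
            at = getattr(self._armed, "at", None)
            if at is None:
                return                    # Monitor not set up: exit at once
            if self._stores == at:
                self._cond.wait(timeout)
            self._armed.at = None

    def store(self, value):
        # Models a store by another agent to the monitored range.
        with self._cond:
            self.value = value
            self._stores += 1
            self._cond.notify_all()


def wait_for(word, trigger_data):
    # Re-check the condition on every pass, because mwait() may return for
    # reasons other than the store being waited for.
    while word.value != trigger_data:
        word.monitor()
        if word.value != trigger_data:    # check again after arming
            word.mwait()
    return word.value
```

A caller would create a MonitoredWord, let another thread call store with the trigger value, and call wait_for on the waiting thread. The second check after monitor() is there so that a store landing between the first check and the arming is not missed.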

If more than one processor is waiting for this condition (as in the spin-lock implementation, when the value written to memory means resource unlocked), every waiting processor attempts to acquire the freed resource using an atomic exchange instruction, which succeeds for one of them and fails for the others. A failed processor must rerun the code, while the successful processor continues with its acquired resource. An example of spin-lock processing is described using pseudo code (see Figure 2).
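Figure 2 is not included in this abbreviated text. The race described above can be sketched as a software simulation in Python, under stated assumptions: LockWord, exchange, wait_while_locked, acquire, release, worker and run are hypothetical names, threads stand in for processors, and a condition variable stands in for the Monitor/Mwait pair. It is an illustration of the wait, exchange and retry pattern, not the disclosure's own code.

```python
import threading

UNLOCKED, LOCKED = 0, 1


class LockWord:
    """Software stand-in for a lock word in monitored memory (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._value = UNLOCKED

    def exchange(self, new):
        # Models an atomic exchange: store `new`, return the previous value.
        with self._cond:
            old, self._value = self._value, new
            self._cond.notify_all()       # a store wakes the waiters
            return old

    def wait_while_locked(self, timeout=0.05):
        # Models the Monitor/Mwait loop: sleep until the word no longer reads
        # LOCKED. The timeout is an assumed stand-in for "other events".
        with self._cond:
            while self._value == LOCKED:
                self._cond.wait(timeout)


def acquire(word):
    while True:
        word.wait_while_locked()
        if word.exchange(LOCKED) == UNLOCKED:
            return                        # this waiter won the race
        # Another waiter took the lock first: rerun the wait.


def release(word):
    word.exchange(UNLOCKED)


def worker(word, counter, rounds):
    for _ in range(rounds):
        acquire(word)
        counter[0] += 1                   # critical section
        release(word)


def run(threads=4, rounds=200):
    word, counter = LockWord(), [0]
    pool = [threading.Thread(target=worker, args=(word, counter, rounds))
            for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter[0]
```

Calling run(4, 200) starts four threads that each take the lock 200 times. Because only one exchange can observe UNLOCKED after each release, every increment happens inside the critical section.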

General description

The disclosed method improves the performance of other logical processors by using the MwaitExchange instruction, which is used with the Monitor instruction and replaces the Mwait instruction. MwaitExchange takes one parameter that Mwait does not, WriteValue, which is the value written to the monitored memory. When the memory is in use, the parameter value is LOCKED. When the memory is free, the parameter value is UNLOCKED. The instruction reads the value written by the other processor and replaces it with the new value atomically, using a lock mechanism.
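The exact semantics of MwaitExchange are not fully specified in this abbreviated text, so the following Python sketch is a software model under explicit assumptions. It assumes the instruction waits for a store (or a timeout standing in for other events) and then swaps WriteValue into the memory as one atomic step, returning the value that was there. It also folds the preceding Monitor setup and value check into the same call. MonitoredLock, mwait_exchange, store, acquire and release are hypothetical names.

```python
import threading

UNLOCKED, LOCKED = 0, 1


class MonitoredLock:
    """Software model of a lock word used with MwaitExchange (hypothetical)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._value = UNLOCKED

    def store(self, value):
        # Models a plain store by another processor; wakes the waiters.
        with self._cond:
            self._value = value
            self._cond.notify_all()

    def mwait_exchange(self, write_value, timeout=0.05):
        # Assumed semantics: if the word already holds write_value, wait for
        # a store or a timeout; then replace the word with write_value and
        # return the old value. The condition's lock makes the wake-up and
        # the exchange a single atomic step in this model.
        with self._cond:
            if self._value == write_value:
                self._cond.wait(timeout)
            old, self._value = self._value, write_value
            return old


def acquire(lock):
    # One call both waits and attempts the lock; retry until it read UNLOCKED.
    while lock.mwait_exchange(LOCKED) != UNLOCKED:
        pass


def release(lock):
    lock.store(UNLOCKED)
```

In this model the waiter that wakes first swaps in LOCKED before any other waiter can observe UNLOCKED, which is the property the separate Mwait followed by an exchange does not give. A waiter that wakes and reads LOCKED simply calls mwait_exchange again.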

The disclosed method provides advantages, including:

•        Improved performance due to reducing CPU utilization while waiting for a resource

•        Improved performance due to increasing the probability of the processor acquiring the lock

•        Improved performance due to elevating the performance of other processors not waiting for the resource

Detailed description

On an HT system, if one logical processor has no work (such as an idle loop), the disclosed method uses microarchitectural resources to optimize the...