Browse Prior Art Database

Optimizing FUTEX (Fast User Space Mutex) for Contended Cases to Save CPU Cycles

IP.com Disclosure Number: IPCOM000236752D
Publication Date: 2014-May-14
Document File: 6 page(s) / 782K

Publishing Venue

The IP.com Prior Art Database

Abstract

Multiprocessor system synchronization between multi-processes and multi-threads is very important. In this paper we present a method of user space synchronization. The method includes a LOCK method based on FUTEX (Fast User Space Mutex) that optimizes user space synchronization like read/write lock, MUTEX and condition variables. This method save CPU cycles by saving the extra futex wake-up system call when the LOCK is contended. As synchronization is often used in multi-threaded programming and lock contention has a significant part in the overall lock acquiring path, this effectively improves the overall user space processing.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 56% of the total text.

Title

Optimizing FUTEX (Fast User Space Mutex) for Contended Cases to Save CPU Cycles

Abstract

Multiprocessor system synchronization between multi-processes and multi-threads is very important.  In this paper we present a method of user space synchronization.  The method includes a LOCK method based on FUTEX (Fast User Space Mutex) that optimizes user space synchronization like read/write lock, MUTEX and condition variables.  This method save CPU cycles by saving the extra futex wake-up system call when the LOCK is contended. As synchronization is often used in multi-threaded programming and lock contention has a significant part in the overall lock acquiring path, this effectively improves the overall user space processing.

Problem

FUTEX lock is implemented using an atomic variable.  The FUTEX lock algorithm checks for contention in the user space and only if the lock is contented, the Linux FUTEX system calls of sleep and wake-up are used to implement the locking.  This state of lock is maintained in the user space with a single atomic variable.

Atomic variable value

Meaning

1

Unlocked

0

Locked

-1

Locked with sleeper(s). (This is the case of lock contention)

The problem is:

a.        The futex state “Locked with sleeper(s)” can have one or more threads sleeping but the number is unknown.

b.       When a thread tries to acquire futex, if the futex lock is available then it takes the lock and sets the state to “Locked” and if the futex is already locked then it sets the state to “Locked with sleeper(s)” and does a system call to sleep on the futex wait queue.

c.        When a thread unlocks the futex, the state is set to “Unlocked” and if the previous state was “Locked with sleeper(s)” then it does a system call to wake-up the first sleeping thread.  If the state was “Locked”, it doesn’t do a system call.

d.       The first thread in the futex queue is woken up and it cannot set the state to “Locked” because if another thread is also in the futex wait queue then it won’t be woken up on the next futex unlock. The thread sets the state to “Locked with sleeper(s)”.

e.       The problem is that if there is no thread waiting on the futex queue then on the next unlock, a system call will be made even though there is thread waiting on futex queue. This system call is a waste if there is no thread waiting in the futex queue.

In the method presented below, we avoid this wasted system call.

Solution to problem

The solution to this pr...