Browse Prior Art Database

Pipelined locking for fine-grained parallelism

IP.com Disclosure Number: IPCOM000245340D
Publication Date: 2016-Mar-02
Document File: 2 page(s) / 26K

Publishing Venue

The IP.com Prior Art Database

Abstract

Disclosed is a method to improve the performance of software by making the synchronization finer grained, improving the thread-level parallelism. Our target application has a critical section consisting of multiple tasks, each task fetched from a different series of tasks. Each thread must take tasks from the multiple series of tasks in the same order. Our technique employs fine-grained locking for each task in a pipelined manner.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 72% of the total text.

Page 01 of 2

Pipelined locking for fine-grained parallelism

Abstract:


Disclosed is a method to improve the performance of software by making the synchronization finer grained, improving the thread-level parallelism.

Our target application has a critical section consisting of multiple tasks, each task fetched from a different series of tasks. Each thread must take tasks from the multiple series of tasks in the same order. Our technique employs fine-grained locking for each task in a pipelined manner.

This technique handles the software having a critical section, which consists of multiple tasks. Here, each task is taken from a disjoint series of tasks. In the figure, "task A_i" means a i-th task fetched from the series A.

Here, we need to satisfy the following conditions:
(1) We must execute tasks from one series in order. That is, we cannot start executing task A_i+1 before the execution of task A_i ends.

(2) Each thread must execute the tasks from the all series of tasks in the same order, i.e. if a thread take task A_i, it must execute task B_i and task C_i, but not task B_i-1 or task B_i+1.

The simplest solution is using a giant lock which guards the entire critical section. (Left side of the figure)

However, this solution performs poorly on a parallel system due to the lack of thread-level parallelism at this critical section.

Our technique increase the parallelism by using multiple fine-grained locks and hence make multiple threads, up to the number of tasks in the criti...