Browse Prior Art Database

Using WorkPackets for improved load balancing in multithreaded object tracing in Garbage Collection Disclosure Number: IPCOM000020394D
Original Publication Date: 2003-Nov-19
Included in the Prior Art Database: 2003-Nov-19
Document File: 4 page(s) / 61K

Publishing Venue



The method described here uses the concept of WorkPackets: The WorkPacket has a stack, whose capacity is smaller then that of the traditional mark stack. WorkPackets are retrieved and returned to global lists shared by all possible tracers, via simple and efficient C&S operations.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 27% of the total text.

Page 1 of 4

Using WorkPackets for improved load balancing in multithreaded object tracing in Garbage Collection

Related concepts and terms in garbage collection : The process of tracing all reachable (live) objects in the heap is a basic component of most of the existing garbage collectors. This tracing is usually done by using a mark stack, with push and pop operations. Each object, when first reached, is marked and pushed to the stack. It is later popped from the stack and traced. In trace step, all objects that are referenced from this object, and are not marked yet, are marked and pushed to the mark stack. The marking of objects is done in order to insure that an object is pushed to the mark stack only once. Marking is either done by using a dedicated data field in the object, or by maintaining a separate bit-per-possible-object-slot vector, called the markbits vector.

  The collection may be done when the execution of the application is suspended (Stop The World collection, or STW), or in parallel to the execution of the application (concurrent collection), or mostly in parallel to the application execution, with final (shorter) part where the application is suspended (mostly concurrent collection).

  In all these collectors, the collection, or at least the tracing part of it, may be done simultaneously by several threads, each using a private mark stack. Two, or more, threads may reach the same object simultaneously (if this object is referenced from many objects). The (exclusive) pushing of this unmarked object is done by the thread which wins the race for marking the object. The marking data structures are common and the race is usually handled by using synchronized operations such as compare-and-swap (C&S).

  The advantages of multithreaded STW collection on an SMP machine are obvious. The concurrent, and mostly concurrent, may benefit from being multithreaded by acquiring better ability to control the rate of tracing relatively to the rate of allocation, in order to complete the tracing in time, before the heap is exhausted. Multithreaded STW collection is usually done by a fixed, predefined number of threads. This is not necessarily the case In the concurrent and mostly concurrent collections - the number of participating threads at any point of time may vary from none, to one, or to many. An example for such a collection is when JAVA mutators threads are doing small tasks of concurrent tracing as part of each allocation request from the heap.

The problem:

  Multithreaded tracing creates problems of load balancing. Some of the threads may have empty mark stacks and be starved for work, while others may have sufficient volume of work on their mark stack, or ever reach mark stack overflow state.

  It is therefore necessary for any multithreaded tracing collector to include a load balancing mechanism, so all participating threads can get an equal share of the marked objects, even if they were not the ones to actually mark them.

  In collectors, s...