Improving throughput and heap layout in a multithreaded generational garbage collector by monitoring aggregate rate of copy production, stalled threads, and other variates and adjusting copy cache sizes accordingly
Publication Date: 2017-Aug-10
The IP.com Prior Art Database
Improving Throughput and Heap Layout in a Multithreaded Generational Garbage Collector by Monitoring Aggregate Rate of Copy Production, Stalled Threads, and Other Variates and Adjusting Copy Cache Sizes Accordingly
Disclosed is a method to achieve a balance between cache size and work queue management in generational garbage collectors (GGCs) by dynamically choosing cache sizes based on feedback obtained from frequent samples of salient runtime metrics from participating GC threads.
Generational garbage collectors (GGCs) operate by traversing reference graphs from a collection of root objects, moving objects from a designated evacuation space into a designated survivor space, and iteratively moving objects referenced by objects so copied. Hierarchical GGCs attempt to improve locality of reference, so that the address (A) of the copied object is as close as possible to the address (B) of the referring slot. *
Hierarchical copying is effective in improving locality only when the containment metric (log2(A XOR B)) is not greater than log2(P) where P is the platform-dependent page size P (P-containment). The Siegwart-Hirzel* algorithm is ineffective in this sense whenever a participating GC thread is aliased to a scan/copy cache of size S with containment K where log2(P) < K < log2(S) (referring slot address in same cache as copied referent address but not in same page). Moreover, an aliased thread is constrained to continue in this manner until it runs out of scan work or copy space or is forced to copy across the tenure boundary. Since aliased threads do not produce any work that can be shared with other threads, this effect tends to drive the work queue to more frequent underflow states in which one or more GC threads are blocked.
The effectiveness of Siegwart-Hirzel is improved by allocating smaller scan/copy caches, up to a point at which the cost of managing the proportionate increase in between-thread traffic involved with the increased number of caches in process becomes excessive.
The novel method described here achieves a balance between cache size and work queue management by dynamically choosing cache based on feedback obtained from frequent samples of salient runtime metrics from participating GC threads. This is achieved by using a multivariate function of the inputs received from aggregate thread sampling of, for example, a proportion of stalled GC threads and a proportion of scanned reference slots that point to unforwarded objects. This function maps the available inputs to a numeric value representing the best estimate of the optimal cache size to allocate when a GC thread requires an allocation of new survivor or tenure space into which to copy. The expected outcome is to maximize effective aliasing as described above as much as possible while maintaining a high rate (bytes/millisecond) of evacuation during each GGC cycle.
As a GC thread operating within a GGC proceeds with scanning a cache, it encounters references fr...