
Adaptive sampling technique for high volume complex data

IP.com Disclosure Number: IPCOM000010944D
Original Publication Date: 2003-Feb-03
Included in the Prior Art Database: 2003-Feb-03
Document File: 2 page(s) / 42K

Publishing Venue

IBM

Abstract

The article describes an adaptive sampling scheme to regulate the amount of data collected within a data-intensive monitoring application.




An important approach for recording performance data is to log events - e.g., method entry and exit events. These entry and exit events can also be augmented with additional data, namely argument and return values. (The value of this is to gain insight into the data flow within the application, and to identify new opportunities for performance improvement based on that insight.) Unfortunately, while very rich in information, this approach (and others) can produce a very large amount of data - sometimes too much to be handled by a postprocessor. This article describes a technique for reducing the volume of that data through a novel sampling technique.

    The basic instrumentation technique is applied through the use of Aspect Oriented (AO) technology. An AspectJ aspect is described that causes method entry and exit events, argument values, and return values to be recorded for a method as it executes. The AspectJ tool effectively injects this logic into each method in an application; when the method is executed, this information is logged.
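
    As a concrete illustration, a minimal sketch of such an aspect is shown below in annotation-style AspectJ. The pointcut pattern (com.example..*), the class name, and the use of System.out as the log sink are illustrative assumptions, not the original instrumentation.

import java.util.Arrays;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

// Hypothetical sketch: wrap every method execution in the (assumed)
// com.example package and log entry, arguments, return value, and exit.
@Aspect
public class TraceAspect {

    @Around("execution(* com.example..*(..))")
    public Object logEntryExit(ProceedingJoinPoint jp) throws Throwable {
        String method = jp.getSignature().toShortString();

        // Record the entry event together with the argument values.
        System.out.println("ENTER " + method + " args=" + Arrays.toString(jp.getArgs()));

        Object result = jp.proceed();  // run the original method body

        // Record the exit event together with the return value.
        System.out.println("EXIT  " + method + " returned=" + result);
        return result;
    }
}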

    To address this problem, the basic technique is extended to be adaptive. The technique records events from all methods until a method has been entered and exited more than a certain threshold number of times (e.g., 1000 times). Once this threshold is reached, sampling is introduced: an initial sampling frequency is established (say, 100), and thereafter only every 100th entry/exit event is recorded. The technique is general and adaptive in that, if this still results in too many samples, a new sampling frequency can be identified and put into effect (say, every 500th event). Because AO technology is used to describe the basic instrumentation, it is very straightforward to extend it with this adaptive scheme. The instrumentation aspect is augmented with logic that counts the number of times data from the method has been logged; once the threshold is exceeded, the logging logic is bypassed until the method has subsequently been executed f times (where f is the sampling frequency).
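
    A minimal sketch of how the aspect above might be augmented with this adaptive scheme is shown below. The per-method counter, the threshold of 1000, and the initial frequency of 100 follow the example in the text; the class and field names, and the pointcut, are hypothetical.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

import org.aspectj.lang.ProceedingJoinPoint;
import org.aspectj.lang.annotation.Around;
import org.aspectj.lang.annotation.Aspect;

// Hypothetical sketch of the adaptive variant: a per-method counter decides
// whether a particular execution is logged.
@Aspect
public class AdaptiveTraceAspect {

    private static final long THRESHOLD = 1000;    // log every call up to this count
    private static volatile long frequency = 100;  // then log one in 'frequency'; may be raised (e.g., to 500)

    private final Map<String, AtomicLong> counts = new ConcurrentHashMap<>();

    @Around("execution(* com.example..*(..))")
    public Object sampleEntryExit(ProceedingJoinPoint jp) throws Throwable {
        String method = jp.getSignature().toShortString();
        long n = counts.computeIfAbsent(method, k -> new AtomicLong()).incrementAndGet();

        // Log unconditionally below the threshold, then only every f-th call.
        boolean log = n <= THRESHOLD || (n - THRESHOLD) % frequency == 0;

        if (log) System.out.println("ENTER " + method);
        Object result = jp.proceed();
        if (log) System.out.println("EXIT  " + method + " returned=" + result);
        return result;
    }
}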

    The result of this approach is that far fewer events are recorded. Because the event-recording mechanism is not executed in full as frequently, the overhead on the profiled application is substantially reduced.

    Further, the sampling technique can be described in more general terms than the example above. Several basic sampling building blocks can be added, a few of which are:

    - One in N method invocations (as discussed)
    - Every t seconds (a slight variant, but one in...
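
    The two building blocks named above could, for example, be expressed as interchangeable sampling predicates that the instrumentation aspect consults before logging. The sketch below is an assumption about how such policies might be packaged; the interface and class names are hypothetical.

import java.util.concurrent.atomic.AtomicLong;

// Hypothetical: a pluggable policy the logging aspect asks before recording an event.
interface SamplingPolicy {
    boolean shouldLog();
}

// "One in N method invocations": record every n-th call.
class OneInN implements SamplingPolicy {
    private final long n;
    private final AtomicLong count = new AtomicLong();
    OneInN(long n) { this.n = n; }
    public boolean shouldLog() { return count.incrementAndGet() % n == 0; }
}

// "Every t seconds": record at most one event per time window.
class EveryTSeconds implements SamplingPolicy {
    private final long windowMillis;
    private long lastLogged = 0;
    EveryTSeconds(long seconds) { this.windowMillis = seconds * 1000; }
    public synchronized boolean shouldLog() {
        long now = System.currentTimeMillis();
        if (now - lastLogged >= windowMillis) { lastLogged = now; return true; }
        return false;
    }
}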