
A self optimized monitoring engine that collects just enough data based on the workload characteristics.

IP.com Disclosure Number: IPCOM000199910D
Publication Date: 2010-Sep-21
Document File: 3 page(s) / 108K

Publishing Venue

The IP.com Prior Art Database

Abstract

Resource monitoring is a system management activity that is crucial to keeping a data processing system healthy. Each resource in the environment provides a set of performance statistics, also known as sensors, for collecting performance data. When monitoring is enabled, the statistics are collected at regular intervals and persisted for later analysis. Generally, this performance data is used to identify performance bottlenecks at the resource level. The main issue is that monitoring carries overhead: it takes a portion of CPU and memory to collect and store the performance statistics. As the number of statistics grows, the overhead of collecting them increases, and throughput suffers. Moreover, under stable workloads the system behaves in a largely predictable manner, so it makes little sense to collect redundant information that adds nothing to the analysis. Existing monitoring solutions provide fine-grained control over the statistics, and the settings can be changed dynamically. However, the process is manual and demands live monitoring of the system 24x7: an administrator must be present and watch the live system to adjust the monitoring settings continuously. In practice, this has limitations. Therefore, the system always ends up collecting excess data, even when it is not needed.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 53% of the total text.



Solution:

The graph below depicts the general workload pattern observed in a system. The pattern shows that the workload changes roughly linearly (or it may change rapidly) and then reaches a point where it remains constant for a long time. The whole concept of the current innovation revolves around this fact.

When the workload changes, collect the maximum performance data possible, so that any performance repercussions of the workload shift can be analyzed. But once the workload remains constant at one point, the system is stable. Hence, collecting the maximum data at that point is expensive and, more importantly, yields no benefit.

In the proposed solution, a resource monitoring agent (RMA) is defined, which is responsible for enabling and controlling resource-level monitoring. Each resource is responsible for organizing its statistics into three categories: a) BASIC, b) MEDIUM, and c) HIGH. The categorization should be based on the amount of data that is to be collected.

The BASIC level must represent the surface characteristics of a resource, such as its workload and response time. The MEDIUM and HIGH levels must contain statistics that help in understanding the internal working of a resource, details that are useful in performance analysis and debugging. Each resource must also define a policy that tells the RMA which actions to take after it analyzes the data available at the BASIC level. The policies are always validated against the BASIC data.
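As a rough illustration of the categorization described above, the following Python sketch shows one way a resource might group its statistics by level. The `Level` and `Resource` types, the statistic names, and the choice to make levels cumulative (a higher level also collects the lower ones) are assumptions made for this example and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import Dict, List


class Level(IntEnum):
    """Monitoring levels, ordered by the amount of data collected."""
    BASIC = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Resource:
    """A monitored resource that groups its statistics by level (hypothetical type)."""
    name: str
    statistics: Dict[Level, List[str]] = field(default_factory=dict)

    def enabled_statistics(self, level: Level) -> List[str]:
        """Return every statistic at or below the given level.

        Assumption: levels are cumulative, so BASIC data is always collected.
        """
        names: List[str] = []
        for lvl in sorted(self.statistics):
            if lvl <= level:
                names.extend(self.statistics[lvl])
        return names


# Hypothetical resource; the statistic names are illustrative only.
pool = Resource(
    name="connection_pool",
    statistics={
        Level.BASIC: ["request_count", "response_time"],
        Level.MEDIUM: ["wait_time", "pool_size"],
        Level.HIGH: ["per_connection_use_time"],
    },
)
```

Keeping BASIC statistics enabled at every level fits the requirement that policies are always validated against the BASIC data.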

[Figure: workload (vertical axis, 0 to 120) plotted against time (horizontal axis, intervals 1 to 27)]

BASIC: stddev/variance < 0.1

MEDIUM: stddev/variance > 0.1 and < 0.6

HIGH: stddev/variance > 0.6 and < 0.9
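The policy thresholds above can be sketched as a small selection function. This is only one possible reading: it assumes that "stddev/variance" is measured as the coefficient of variation (standard deviation divided by mean) of recent BASIC workload samples, and that each range names the collection level that applies while the measure stays within it. The abbreviated text confirms neither point. The handling of the exact boundary values and of values at or above 0.9 is also an assumption, and the function names are hypothetical.

```python
import statistics
from enum import Enum


class Level(Enum):
    BASIC = "BASIC"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


def variability(samples):
    """Coefficient of variation (stddev / mean) of BASIC workload samples.

    Assumption: the source writes "stddev/variance" without defining it.
    """
    mean = statistics.fmean(samples)
    if mean == 0:
        return 0.0  # an idle system is treated as stable
    return statistics.pstdev(samples) / mean


def select_level(samples):
    """Map recent workload variability to a monitoring level."""
    v = variability(samples)
    if v < 0.1:
        return Level.BASIC
    if v < 0.6:
        # Assumption: a value of exactly 0.1 falls into MEDIUM.
        return Level.MEDIUM
    # Assumption: 0.6 and above, including values past 0.9, map to HIGH.
    return Level.HIGH
```

Under these assumptions, a flat window such as `[100, 100, 101, 99, 100]` selects BASIC, a steady ramp such as `[20, 40, 60, 80, 100]` selects MEDIUM, and a sudden burst such as `[0, 0, 0, 0, 100]` selects HIGH.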

When a system is started with monitoring enabled, the RMA will enable monitori...