Method and system for tracing with time-delay decisions_v2.1

IP.com Disclosure Number: IPCOM000220179D
Publication Date: 2012-Jul-25
Document File: 6 page(s) / 101K

Publishing Venue

The IP.com Prior Art Database

Abstract

The system stores all trace information in memory and dumps it to the file system according to a filtering policy when an issue occurs. The filtering policies, which are defined in the Rule Repository, can vary. Users can define a simple policy that dumps all information in the trace buffer, or a more complex policy that saves I/O resources.

1. For an exception that is hard to reproduce: trace information in the buffer is captured and printed the first time the exception occurs.

2. For an exception that occurs randomly: trace information in the buffer is printed to the file system only when the issue occurs. Before that, the trace system requests no I/O resources, so a large amount of I/O is saved.

3. For the case where trace output causes a performance issue: trace information is stored in the buffer, not in the file system. A policy in the Rule Repository can filter the trace information in the buffer and freeze the buffer until an administrator dumps it manually.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 30% of the total text.


The tracing component is a fundamental utility in most computer programs. It is responsible for exporting trace information that indicates how the program is running. This information helps program users and developers find the cause of an issue when the program behaves unexpectedly. Because trace information is so important for troubleshooting, the more detail the tracing component outputs, the more useful it is. On the other hand, exporting traces consumes resources. In particular, if traces are stored in permanent storage such as a file system, time-expensive I/O operations are involved, and these may ultimately degrade the performance of the running program.

Problems of existing tracing implementations:


1. They are not problem-oriented: they do not take into account whether there is, or will be, a problem before deciding to export traces.

Existing tracing implementations decide whether traces should be output as soon as the traces are generated. If traces are exported but no problem occurs, computing resources are consumed for nothing. If traces are not exported and a problem does occur, no trace information is available for troubleshooting. An ideal tracing component should therefore take into account whether there is, or will be, a problem before deciding whether to export traces. Unfortunately, when a problem occurs, the traces needed are those generated before the problem, not those generated after it, and it is hard for a tracing component to forecast whether a problem will occur.

For example, in most cases trace output is disabled while a program is running normally, in order to achieve better system performance. When an unexpected problem occurs, trace output is enabled and we try to reproduce the problem, collecting the trace for troubleshooting if the reproduction succeeds. Sometimes, however, the problem occurs only occasionally and at random, and it cannot be reproduced even with considerable effort, so it is very difficult to obtain a trace for troubleshooting. Furthermore, for a system that requires high throughput and low latency, if the problem does not severely impact the system, we are reluctant to enable trace output at all, because exporting traces consumes significant system resources and eventually impacts system performance.
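The approach outlined in the abstract (keep traces in memory and write them out only when a problem occurs) can be sketched roughly as follows. This is a minimal illustration, not the disclosed implementation: the names `BufferedTracer` and `run_traced`, the fixed-capacity buffer, and the representation of a filtering policy as a simple predicate are all assumptions made for this example.

```python
from collections import deque
from typing import Callable


class BufferedTracer:
    """Hypothetical tracer that holds recent records in memory, with no I/O."""

    def __init__(self, capacity: int,
                 policy: Callable[[str], bool] = lambda record: True) -> None:
        # Assumption: a bounded buffer; the oldest records are dropped when full.
        self._buffer: deque = deque(maxlen=capacity)
        # Assumption: a filtering policy is a predicate over one record.
        self._policy = policy
        self._frozen = False

    def trace(self, record: str) -> None:
        # Recording only touches memory; a frozen buffer ignores new records.
        if not self._frozen:
            self._buffer.append(record)

    def freeze(self) -> None:
        # Preserve the current contents until someone dumps them manually.
        self._frozen = True

    def dump(self, write: Callable[[str], None]) -> int:
        # Write the records accepted by the policy, then reset the buffer.
        selected = [r for r in self._buffer if self._policy(r)]
        for r in selected:
            write(r)
        self._buffer.clear()
        self._frozen = False
        return len(selected)


def run_traced(tracer: BufferedTracer, task: Callable[[], object],
               write: Callable[[str], None]) -> object:
    """Run task; export the buffered traces only if it raises."""
    try:
        return task()
    except Exception as exc:
        tracer.trace(f"ERROR {exc!r}")
        tracer.dump(write)
        raise
```

In this sketch, `write` could be the `write` method of an open log file. Nothing reaches it while the task succeeds, so the traces that preceded a failure are available the first time the failure happens without paying for I/O beforehand.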


2. They cannot automatically choose a proper time to export traces.

Traces are exported at the moment they are generated, regardless of whether the system is busy at that time. Because traces are used for troubleshooting afterwards, exporting them should not take priority over executing program logic. If a system is busy executing logic, it is better to delay exporting traces until the system is no longer busy. This problem often arises in production environments.
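One way to picture delaying export until the system is idle is the sketch below. It is only an illustration under stated assumptions: the class name `DeferredExporter`, the `is_busy` callback used to detect load, and the `max_records` batch limit are hypothetical and are not taken from the disclosure.

```python
from collections import deque
from typing import Callable


class DeferredExporter:
    """Hypothetical exporter that queues traces and writes them when idle."""

    def __init__(self, is_busy: Callable[[], bool],
                 write: Callable[[str], None]) -> None:
        self._pending: deque = deque()
        self._is_busy = is_busy  # assumption: caller supplies a load check
        self._write = write      # e.g. the write method of an open log file

    def trace(self, record: str) -> None:
        # Generating a trace only queues it; no I/O happens here.
        self._pending.append(record)

    def export_if_idle(self, max_records: int = 100) -> int:
        """Write queued records while the system is idle; return the count."""
        written = 0
        while (self._pending and written < max_records
               and not self._is_busy()):
            self._write(self._pending.popleft())
            written += 1
        return written
```

A caller would invoke `export_if_idle` periodically, for example from a low-priority background loop. The load check runs before every record, so export stops as soon as the system becomes busy again.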

Traditional tracing system:


Applications send trace information (trace stream) to trace co...