
Method and System for Log file compression using message bundles and adaptive field redundancy

IP.com Disclosure Number: IPCOM000240303D
Publication Date: 2015-Jan-21
Document File: 7 page(s) / 82K

Publishing Venue

The IP.com Prior Art Database

Abstract

Support and problem determination are major activities for any software development team, and a vital resource in that work is the product or application log files. While an application runs, its log files can grow to hundreds of megabytes or even several gigabytes. This poses a major problem when it comes to retrieving log files and sending them to customers or software support teams. Introduced is a dynamic, log-structure-aware approach to log file compression which achieves much better compression ratios than traditional structure-agnostic approaches.



Data compression is an evergreen field of research that continues to attract many contributions. Given the large amount of data generated and consumed by modern applications, newer compression techniques are in constant demand. A very common example of such data is the log files generated by applications. Logging has become mandatory for all applications and is considered a measure of an application's quality. Most log files are meant to be human readable so that they can serve as a starting point for problem determination. Problem determination is a typical scenario in which log files prove especially useful: whenever a customer reports a problem with a product, support personnel request the log files, from which they can start determining the cause of the issue. There are applications that generate logs ranging from tens or hundreds of megabytes to a few gigabytes. As the logs grow larger, it becomes difficult for the customer or end user of a product to share them over the network, so existing compression techniques are used to reduce the amount of data sent to the support personnel. Although generic compression techniques are available, type- and data-specific compression techniques have also gained popularity, since they are more efficient and better suited to a particular type of data. Because log files are based on a standard structure and a few basic guidelines, using knowledge of that structure to achieve compression yields a much better compression ratio for log files than other existing methods.


A log file follows a set of rules/specifications defined by the underlying logging framework. Many tools on the market can understand the format of log files and make use of it, ranging from mere display of the data (XSLT) to automated problem determination (Log Analyzer). Log files generally contain a message, time, component, sub-component, severity, and a range of values generated by the application on the fly based on the instance and the environment (e.g., thread ID, process ID). The method described herein uses the structure of the log files to improve compression by applying specialized compression techniques to specific fields in the log file.
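As a rough illustration of this field-aware idea, the following Python sketch splits each log line into its fields, groups each field into its own stream, and compresses every stream separately, so that highly repetitive fields (severity, component, thread ID) are compressed together rather than interleaved with the free-text message. The log line layout, field names, and the choice of zlib are assumptions made for the sake of the sketch; they are not part of the disclosure.

```python
import zlib

# Hypothetical log line layout assumed for this sketch:
#   2015-01-21 10:15:30,123 [thread-7] INFO  com.example.Service - request completed
SAMPLE_LOGS = [
    "2015-01-21 10:15:30,123 [thread-7] INFO  com.example.Service - request completed",
    "2015-01-21 10:15:30,456 [thread-7] INFO  com.example.Service - request completed",
    "2015-01-21 10:15:31,002 [thread-2] WARN  com.example.Cache - cache miss for key 42",
]

def split_fields(line):
    """Split one log line into (timestamp, thread, severity, component, message)."""
    timestamp, rest = line[:23], line[24:]
    thread, rest = rest.split("] ", 1)
    severity, rest = rest.split(None, 1)
    component, message = rest.split(" - ", 1)
    return timestamp, thread.lstrip("["), severity, component.strip(), message

def compress_per_field(lines):
    """Group each field into its own stream and compress every stream separately,
    so repetitive values (severity, component, thread) sit next to each other."""
    columns = zip(*(split_fields(line) for line in lines))
    return [zlib.compress("\n".join(column).encode()) for column in columns]

whole_file = zlib.compress("\n".join(SAMPLE_LOGS).encode())
per_field = compress_per_field(SAMPLE_LOGS)
print("whole-file zlib bytes:", len(whole_file))
print("per-field  zlib bytes:", sum(len(stream) for stream in per_field))
```

On a toy three-line sample the per-stream header overhead dominates; the benefit of grouping repetitive fields only shows up on realistic multi-megabyte logs, and field-specific encoders (for example, delta-encoding the timestamps or dictionary-encoding severities and component names) could improve the ratio further.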

This technique allows for a better compression ratio than regular text compression, since log files contain content that does not compress well with generic text compression techniques. Tests have shown a minimum of 30 to 40% improvement with this technique over existing text compression.

All existing text compression techniques would be prior art.

A log file follows a set of rule...