
Method and System to Identify False Alerts in Error Log

IP.com Disclosure Number: IPCOM000243908D
Publication Date: 2015-Oct-28
Document File: 7 page(s) / 112K

Publishing Venue

The IP.com Prior Art Database

Abstract

We propose a system and method to identify false alerts by correlating the operation log with the alert log. The system uses pattern-learning techniques to learn the correlation patterns between alert-log and operation-log entries, and recognizes false alerts based on those patterns.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 56% of the total text.

In current IT systems, log data is the most essential source for IT administrators to find out how a system is being used and how it performs; logs also record alerts and failed events. Based on log data, various kinds of log analysis methods can be developed to help identify problems, diagnose the root cause of system failures, and improve the maintainability of the whole infrastructure.

The quality of log data strongly affects the correctness of log analysis. Wrong messages, especially wrong alerts recorded in the logs, lead to misdiagnosis of the system or server, and wrong decisions may then be made based on the analysis result. We observe the following facts:


1. Various system and application change operations can impact the system state.

2. The monitoring system tries to identify deviations of the system state from the expected state.

The current problem is that there is no linkage between these two systems (the one performing changes and the one monitoring), which leads to false alarms caused by scheduled changes.

To solve this problem, two methods are often used.

1. Set a "change window".

Changes can only be performed within the change window, during which all alarms are ignored. However, this method is challenged by continuous delivery and DevOps requirements.


2. Directly integrate the two systems.

The two systems are integrated so that the delivery system informs the monitoring system that some changes will be performed and that some alarms should be turned off.

This method also has shortcomings: (1) it requires very specific integration work; (2) a human needs to define the correlation between changes and alarms.
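The two existing workarounds can be illustrated with a minimal sketch. Everything here is a hypothetical illustration rather than part of the disclosure: the `Alert` type, the function names, and the change and alarm names are assumptions. The first function ignores every alert raised inside the change window; the second relies on a hand-maintained mapping from announced changes to the alarms that should be muted.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Alert:
    name: str            # alarm identifier (hypothetical)
    raised_at: datetime  # time the monitoring system raised it


def suppressed_by_window(alert: Alert, start: datetime, end: datetime) -> bool:
    """Workaround 1: ignore every alert raised inside the change window."""
    return start <= alert.raised_at <= end


# Workaround 2: a person has to write and maintain this mapping by hand.
# The change and alarm names are illustrative assumptions.
MUTED_ALARMS_BY_CHANGE = {
    "restart_web_server": {"http_port_down", "process_missing"},
}


def suppressed_by_integration(alert: Alert, announced_change: str) -> bool:
    """Mute only the alarms that were manually linked to the announced change."""
    return alert.name in MUTED_ALARMS_BY_CHANGE.get(announced_change, set())
```

In this sketch the window variant suppresses any alert regardless of its cause, while the integration variant only works for changes that someone has already mapped to alarms.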

To solve this problem, our main idea is to:
1. Build a linkage between operational data and system alerts, and leverage the operational data to correct system alerts.
2. Find patterns of operation-alert pairs and calculate their correlation as the basis for false alert recognition.
3. Recognize false alerts online and tag the alert log.
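The three steps above can be sketched as follows. The disclosure does not specify the pattern-learning technique at this point, so this sketch substitutes a simple co-occurrence ratio: for each operation-alert pair, the fraction of past operations of that type that were followed by that alert type within a time window. The `Event` type, the function names, the 300-second window, and the 0.8 threshold are all assumptions made for illustration.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str         # operation type or alert type (hypothetical labels)
    timestamp: float  # seconds since an arbitrary epoch


def learn_correlation(operations, alerts, window=300.0):
    """Training step: for each (operation, alert) pair, estimate the fraction
    of operations of that type followed by that alert type within `window`."""
    op_counts = Counter(op.kind for op in operations)
    pair_counts = Counter()
    for op in operations:
        # Count each alert type at most once per operation occurrence.
        followed = {
            a.kind for a in alerts
            if 0 <= a.timestamp - op.timestamp <= window
        }
        for alert_kind in followed:
            pair_counts[(op.kind, alert_kind)] += 1
    return {
        pair: count / op_counts[pair[0]]
        for pair, count in pair_counts.items()
    }


def tag_alert(alert, recent_operations, correlation,
              window=300.0, threshold=0.8):
    """Analysis step: tag the alert as false if a recent operation strongly
    predicts it according to the learned correlation table."""
    for op in recent_operations:
        in_window = 0 <= alert.timestamp - op.timestamp <= window
        score = correlation.get((op.kind, alert.kind), 0.0)
        if in_window and score >= threshold:
            return "false_alert"
    return "alert"
```

A learned table from historical logs would be passed to `tag_alert` for each incoming alert, together with the operations seen recently. A real system would likely need a more robust correlation measure and a way to handle alerts that are genuine despite coinciding with a change.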

The system architecture is shown below.

Fig 1. System Architecture

The process is divided into a training process and an analysis process. The process view is depicted below.

Fig 2. Training process

Fig 3. Analysis process

As described in Fig. 1, Firs...