Method and System for Analyzing Application Data Logs

IP.com Disclosure Number: IPCOM000225401D
Publication Date: 2013-Feb-14
Document File: 5 page(s) / 219K

Publishing Venue

The IP.com Prior Art Database

Related People

Rajesh Chandramohan: INVENTOR [+2]

Abstract

A method and system are disclosed for collecting application data logs from production systems, aggregating the logs in a central store, and uploading them to a Hadoop* system. Collecting and processing the application data logs in Hadoop assists in generating defined key-value pairs.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 54% of the total text.

Description

Disclosed is a method and system for collecting application data logs from production systems, aggregating the logs in a central store, and uploading them to a Hadoop* system. Collecting and processing the application data logs in Hadoop assists in generating defined key-value pairs.

In accordance with the method and system, production log data is obtained from all production SMTP servers, as illustrated in Fig. 1.

Figure 1

The production log data is sent to a central aggregator server and then to a Hadoop grid at hourly intervals. Data stored in the Hadoop grid is categorized by date and hour, and further by component and by colocation. At each hourly interval, the data is processed and a count of the messages sent by each user is generated. The message count is based on the following equation.

$$\mathrm{UID} \rightarrow \left\{ \sum_{\mathrm{Line}\,0}^{\mathrm{Line}\,N} \mathrm{NumRCPTs},\ \mathrm{Domain},\ \mathrm{To} \right\}$$

Here, UID is the user ID (the primary key) and NumRCPTs is the recipient count, summed over log lines 0 through N. In this way, the number of messages sent by each individual user ID is computed for a specific hour or day.
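The hourly aggregation can be illustrated with a minimal sketch. It assumes the log lines have already been parsed into records; the field names (`ts`, `uid`, `num_rcpts`, `domain`, `to`) and the sample values are hypothetical and are not taken from the actual production log format. The sketch runs in plain Python rather than on a Hadoop grid, and only mirrors the shape of the equation: one entry per UID and hour, holding the summed NumRCPTs together with the Domain and To values seen.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical parsed log records; the schema is an assumption for illustration.
RECORDS = [
    {"ts": "2013-02-14T10:05:00", "uid": "alice", "num_rcpts": 3,
     "domain": "example.com", "to": "bob@example.com"},
    {"ts": "2013-02-14T10:40:00", "uid": "alice", "num_rcpts": 2,
     "domain": "example.org", "to": "carol@example.org"},
    {"ts": "2013-02-14T11:10:00", "uid": "alice", "num_rcpts": 1,
     "domain": "example.com", "to": "bob@example.com"},
    {"ts": "2013-02-14T10:15:00", "uid": "dave", "num_rcpts": 4,
     "domain": "example.com", "to": "erin@example.com"},
]


def hourly_counts(records):
    """Group records by (date, hour, uid) and sum the recipient counts.

    Returns a dict mapping (date, hour, uid) to a dict holding the summed
    num_rcpts and the sets of domains and To addresses seen for that key.
    """
    totals = defaultdict(lambda: {"num_rcpts": 0, "domains": set(), "to": set()})
    for rec in records:
        ts = datetime.fromisoformat(rec["ts"])
        # Date and hour form the partition; uid is the primary key within it.
        key = (ts.date().isoformat(), ts.hour, rec["uid"])
        entry = totals[key]
        entry["num_rcpts"] += rec["num_rcpts"]
        entry["domains"].add(rec["domain"])
        entry["to"].add(rec["to"])
    return dict(totals)


if __name__ == "__main__":
    for key, value in sorted(hourly_counts(RECORDS).items()):
        print(key, value["num_rcpts"], sorted(value["domains"]))
```

On a Hadoop grid, the same grouping would typically be split into a map step that emits the key and a reduce step that performs the sum; the single loop above only stands in for that pair.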

Similarly, the method and system can be used for an outgoing mail path. An outgoing mail may have to undergo multiple hops before it leaves the company's network, as illustrated in Fig. 2.

Figure 2

At each stage of mail traversal, messages have to go through systems such as the Web farm, Outbound Mail Proxy, Bullet Mail Proxy and Bullet Mail Newmans, which are serially connected. Bullet Mail Newmans includes multiple queues configured with multiple ports and IPs to prioritize and parallelize mail delivery.

At each stage, a series of data (for example, time and message ID) is dumped on the hosts. This assists in relating a data set across the various hops to visualize mail latency and performance, as illustrated in Fig. 3.  Additionally,...
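Relating per-hop data by message ID, as described above, can be sketched as follows. The hop names come from the text; the event layout (message ID, hop name, timestamp in seconds) and the sample values are hypothetical, since the actual fields dumped on the hosts are not listed in full. The sketch computes the time spent between consecutive hops for each message, which is one possible basis for a latency view.

```python
from collections import defaultdict

# Hop order as given in the text for the outgoing mail path.
HOPS = [
    "Web farm",
    "Outbound Mail Proxy",
    "Bullet Mail Proxy",
    "Bullet Mail Newmans",
]

# Hypothetical events: (message_id, hop, timestamp_in_seconds).
EVENTS = [
    ("m1", "Web farm", 100),
    ("m1", "Outbound Mail Proxy", 102),
    ("m1", "Bullet Mail Proxy", 105),
    ("m1", "Bullet Mail Newmans", 111),
    ("m2", "Web farm", 200),
    ("m2", "Bullet Mail Proxy", 207),  # one hop missing from the logs
]


def hop_latencies(events):
    """Join events on message ID and return the latency between hops.

    Returns {message_id: [(from_hop, to_hop, seconds), ...]}, using only
    the hops that were actually logged for each message.
    """
    seen = defaultdict(dict)
    for msg_id, hop, ts in events:
        seen[msg_id][hop] = ts

    result = {}
    for msg_id, stamps in seen.items():
        # Keep the hops in path order, skipping any with no log entry.
        present = [hop for hop in HOPS if hop in stamps]
        result[msg_id] = [
            (a, b, stamps[b] - stamps[a])
            for a, b in zip(present, present[1:])
        ]
    return result


if __name__ == "__main__":
    for msg_id, legs in hop_latencies(EVENTS).items():
        total = sum(seconds for _, _, seconds in legs)
        print(msg_id, legs, "total:", total)
```

Skipping missing hops instead of failing is a design choice made for this sketch; a real pipeline might flag such messages for inspection.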