
Method and System for Providing a Discriminative Model for Discovering Useful Multi-Dimensional Time Series Subsequences (Shapelets)

IP.com Disclosure Number: IPCOM000244258D
Publication Date: 2015-Nov-26
Document File: 6 page(s) / 239K

Publishing Venue

The IP.com Prior Art Database

Related People

Nurjahan Begum: INVENTOR [+4]

Abstract

A method and system are disclosed for providing a discriminative model for discovering useful multi-dimensional time series subsequences (shapelets). The method and system mine email account usage activities (time series) from different dimensions using the discriminative model (algorithm) to discover and rank useful multi-dimensional time series shapelets.

This text was extracted from a Microsoft Word document.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 42% of the total text.

Method and System for Providing a Discriminative Model for Discovering Useful Multi-Dimensional Time Series Subsequences (Shapelets)

Abstract

A method and system are disclosed for providing a discriminative model for discovering useful multi-dimensional time series subsequences (shapelets). The method and system mine email account usage activities (time series) from different dimensions using the discriminative model (algorithm) to discover and rank useful multi-dimensional time series shapelets.

Description

Email account classification is an important problem in email services. Because advertisers spend heavily to promote products through email portals, classifying email accounts helps them target advertisements to the appropriate users. Despite strong empirical evidence that the nearest neighbor (NN) algorithm is effective for tasks such as classification, its high time and space complexity limits its applicability in resource-constrained applications. A newer primitive for time series data mining, the shapelet, has attracted great research interest because of its potential for higher-level mining tasks such as classification and clustering. Shapelets are time series subsequences that are, in some sense, maximally representative of a class. Because shapelet-based methods do not need to search or store the entire dataset, classification and clustering with shapelets are much faster. However, discovering useful shapelets is itself very expensive, and the high time complexity of shapelet discovery becomes worse for multi-dimensional data. A mechanism is therefore needed that discovers the most useful (top) multi-dimensional time series shapelets.

 

Disclosed is a method and system for providing a discriminative model for discovering useful multi-dimensional time series shapelets. The method and system mine the email account usage activities (time series) of users, using the discriminative model (algorithm) to discover and rank useful multi-dimensional time series shapelets so as to classify email accounts optimally and help advertisers target advertisements.

In one scenario, consider a dataset χ with d dimensions and Z classes. A list of candidate multi-dimensional shapelets is generated from χ by sliding a window of the associated length across all d dimensions. The algorithm then uses this list to discover the top-K multi-dimensional shapelets in χ. To do so, the goodness of every candidate shapelet is determined by calculating how well it separates the objects in χ. The most widely used measure of shapelet goodness is Information Gain. To measure the Information Gain of a shapelet, its distance from every training object in χ is calculated. If N is the total number of tr...
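The steps described so far (candidate generation by a sliding window, distance to every training object, ranking by Information Gain, and top-K selection) can be illustrated with a minimal brute-force sketch. It is not the disclosed discriminative model, whose details are not given in this abbreviated text. All function names are hypothetical, and several choices are assumptions made only for illustration: a single fixed window length, a distance defined as the smallest Euclidean distance over all aligned windows (the same offset in every dimension), and Information Gain taken as the best gain over all single distance thresholds.

```python
import math

def sliding_candidates(series, length):
    """Return every multi-dimensional subsequence of the given length.

    `series` is a list of d equal-length lists, one per dimension.
    The same window position is used in all d dimensions (assumption).
    """
    n = len(series[0])
    return [[dim[s:s + length] for dim in series]
            for s in range(n - length + 1)]

def subsequence_distance(candidate, series):
    """Smallest Euclidean distance between the candidate and any aligned
    window of `series`, summing squared differences over all dimensions."""
    length = len(candidate[0])
    best = math.inf
    for window in sliding_candidates(series, length):
        dist = sum((a - b) ** 2
                   for c_dim, w_dim in zip(candidate, window)
                   for a, b in zip(c_dim, w_dim))
        best = min(best, dist)
    return math.sqrt(best)

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def best_information_gain(distances, labels):
    """Best Information Gain over all thresholds that split the objects
    into a 'near' group and a 'far' group by distance to the candidate."""
    order = sorted(zip(distances, labels), key=lambda pair: pair[0])
    n = len(order)
    base = entropy(labels)
    best = 0.0
    for i in range(1, n):
        if order[i - 1][0] == order[i][0]:
            continue  # no threshold fits between equal distances
        near = [label for _, label in order[:i]]
        far = [label for _, label in order[i:]]
        gain = base - (len(near) / n) * entropy(near) - (len(far) / n) * entropy(far)
        best = max(best, gain)
    return best

def top_k_shapelets(dataset, labels, length, k):
    """Score every candidate in the dataset and return the k best as
    (gain, object index, start offset, candidate) tuples."""
    scored = []
    for idx, series in enumerate(dataset):
        for start, cand in enumerate(sliding_candidates(series, length)):
            dists = [subsequence_distance(cand, other) for other in dataset]
            scored.append((best_information_gain(dists, labels), idx, start, cand))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```

Calling `top_k_shapelets(dataset, labels, length, k)` on a list of labelled multi-dimensional series returns the k highest-scoring candidates. The sketch evaluates every candidate against every object, which is exactly the cost that makes shapelet discovery expensive on multi-dimensional data.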