Efficient smooth data categorization over time by split-remerge based optimization

IP.com Disclosure Number: IPCOM000243730D
Publication Date: 2015-Oct-15
Document File: 3 page(s) / 94K

Publishing Venue

The IP.com Prior Art Database

Abstract

Claiming points:

A split-merge, two-step approach for smoothing categorization over time

A technique for efficient split-remerge in an N-dimensional feature space by evaluating a cost function over the generated enumerable solution space

A technique for fast split-remerge in a one-dimensional feature space by fast one-dimensional neighborhood search

A procedure for building a classification model from the clustering results and relabeling the instances whose classification results are inconsistent with the raw clustering results
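
The last claiming point (train a classifier on the clustering results, then relabel the instances the classifier disagrees with) can be illustrated with a minimal sketch. The disclosure does not say which classifier is used; a nearest-centroid classifier is assumed here purely for illustration, and the function names fit_centroids, predict and relabel_inconsistent are hypothetical, as is the sample data.

```python
from collections import defaultdict
from math import dist


def fit_centroids(points, labels):
    """Assumed classifier: one mean vector (centroid) per raw cluster label."""
    groups = defaultdict(list)
    for point, label in zip(points, labels):
        groups[label].append(point)
    return {
        label: tuple(sum(coord) / len(members) for coord in zip(*members))
        for label, members in groups.items()
    }


def predict(centroids, point):
    """Classify a point by its nearest centroid (Euclidean distance)."""
    return min(centroids, key=lambda label: dist(centroids[label], point))


def relabel_inconsistent(points, raw_labels):
    """Relabel instances whose predicted class differs from the raw cluster label.

    Returns the new label list and the indices that were changed.
    """
    centroids = fit_centroids(points, raw_labels)
    new_labels, changed = [], []
    for index, (point, raw) in enumerate(zip(points, raw_labels)):
        predicted = predict(centroids, point)
        new_labels.append(predicted)
        if predicted != raw:
            changed.append(index)
    return new_labels, changed


if __name__ == "__main__":
    # Hypothetical data: the last point sits near cluster "A" but was labeled "B".
    points = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
              (5.0, 5.0), (5.1, 4.9), (4.9, 5.2), (0.3, 0.3)]
    raw_labels = ["A", "A", "A", "B", "B", "B", "B"]
    new_labels, changed = relabel_inconsistent(points, raw_labels)
    print(new_labels, changed)
```

In this sketch only the instance whose raw label conflicts with the classifier is changed; all other labels are left as the clustering produced them.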

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 56% of the total text.


Facts

Analytics is ubiquitous across business scenarios, and clustering is one of the dominant techniques in practice.

In many cases analytics is performed not once and for all, but periodically or continuously as a streaming service.

This is especially pronounced in the following examples, several of which are our own projects:

HR retention bonus based on staff/skill categorization (our project)

Sales branch expansion profiling (our project)

Sales pipeline lead quality categorization (our project)

End-user grouping based on their dynamic profile and behavior information

Business need

A categorization approach that stays smooth over time is welcome and easier to consume

Reduces dramatic changes in resources and investment caused by unnecessary category changes

         
More robust to sudden changes caused by noise, missing data, etc., given that the initial categorization is reliable because it rests on in-depth interaction and collaboration among the business analytics person, the domain experts, and the clients

However, performing categorization independently at each time point cannot produce satisfactory, smooth results, and it incurs repeated team-integration overhead.
Moreover, performing a sound categorization requires iterative collaboration between research and business people, which might not be affordable again and again as new data arrives.
In summary, we define the value as follows:

We provide a consistent categorization module over time, which is expected to be more robust in accuracy,...