Method for mining evolving association rules using time sensitive information
Publication Date: 2014-May-13
The IP.com Prior Art Database
The present invention maintains a living-in-memory association rule model. For each rule, time sensitive information is maintained in addition to the support and confidence measures. At the beginning, the existing association rule model is loaded into memory, with support, confidence and other measures initialized for each rule. During an incremental time frame, time sensitive information is calculated dynamically for each time interval. At the end of an incremental time frame, once the incremental rules have been generated, two rounds of evolving are performed. First, incremental rules found in the current time frame are merged into the existing rules, with support and confidence measures recalculated and new rules added. Second, time sensitive information is used to adjust the existing rules, with several cold rules dropped and the measures of hot rules updated. After that, a snapshot of the association rules, which contains the most up-to-date information from the new data, is kept for further usage. The process then moves on to the next time frame and continues evolving its living-in-memory association rules. Compared with existing approaches, the present invention is novel in using time sensitive information to evolve association rules, effective in finding recent useful rules without their being weakened by the original data, and efficient because the original data need not be scanned during incremental association mining.
The detailed process of the present invention is depicted in Figure 1 below.
First, in step 101, the existing association rules are loaded into memory. An example of an association rule is
A → B (M = 40%)
where A is the antecedent, B is the consequent, and M is a rule measure, which is defined in the rule measure section.
Step 102 selects an incremental time frame with K intervals for incremental association rule mining. The time frame is the period during which new incoming data will be collected. It can be several weeks, months or years, as determined by the specific business scenario. The time frame is then divided into K equal intervals, where K is an input parameter that sets the granularity of the time sensitive analysis in the present invention.
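The division of a time frame into K equal intervals can be sketched as follows. This is a minimal illustration only: the function names `split_time_frame` and `interval_index`, and the choice of half-open intervals, are assumptions made for the example and do not come from the disclosure.

```python
from datetime import datetime

def split_time_frame(start, end, k):
    """Divide the frame [start, end) into k equal intervals.

    Returns a list of (lower, upper) bounds, one pair per interval.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    width = (end - start) / k  # timedelta divided by int gives a timedelta
    bounds = [start + i * width for i in range(k)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))

def interval_index(ts, start, end, k):
    """Return the 0-based interval that contains ts, or None if ts is outside the frame."""
    if not (start <= ts < end):
        return None
    width = (end - start) / k
    # timedelta / timedelta gives a float; clamp to guard against rounding at the edge
    return min(int((ts - start) / width), k - 1)

# Example: a 28-day time frame analysed at weekly granularity (K = 4).
frame_start = datetime(2014, 1, 1)
frame_end = datetime(2014, 1, 29)
intervals = split_time_frame(frame_start, frame_end, 4)
```

A larger K gives a finer-grained view of how a rule behaves within the time frame, at the cost of fewer transactions per interval.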
Data collection begins immediately in step 103 and runs for the duration of the selected time frame. Besides collecting data, step 103 also computes the time sensitive measures and variables, which are defined in the rule measure section and the rule set adjustment section.
Step 104 builds incremental association rules from the new data collected in step 103; it takes place at the end of an incremental time frame. The method of the present invention is not confined to a specific association rule mining algorithm, as it operates on final rules instead of frequent itemsets. Incremental association rule training uses only the newly collected data, not the original data. This way, recent useful rules can be found without being weakened by the original data. Meanwhile, no existing rules are lost.
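To make the idea of mining from new data only concrete, the sketch below uses a deliberately simple miner that produces rules with a single item on each side. The method itself is independent of the mining algorithm, so this miner is merely a stand-in; the name `mine_pair_rules`, the thresholds and the dictionary layout are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations

def mine_pair_rules(transactions, min_support=0.2, min_confidence=0.5):
    """Mine single-item rules a -> b from the given transactions only.

    Hypothetical stand-in for any rule mining algorithm; the original
    (historical) data is never scanned.
    """
    n = len(transactions)
    if n == 0:
        return {}
    item_counts = Counter()
    pair_counts = Counter()
    for t in transactions:
        items = set(t)
        item_counts.update(items)
        # every ordered pair (a, b) of distinct items in this transaction
        pair_counts.update(permutations(sorted(items), 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n
        confidence = count / item_counts[a]
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = {"support": support, "confidence": confidence, "n": n}
    return rules

# Transactions collected during the current time frame (illustrative data).
new_data = [{"milk", "bread"}, {"milk", "bread"}, {"milk"}, {"eggs"}]
incremental_rules = mine_pair_rules(new_data)
```

Because the thresholds are applied to the new data alone, a pattern that has only recently become frequent can pass them even if it would be diluted below the threshold in the full history.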
Step 105 combines the incremental association rules with the existing rules. This step is the first round of evolving in the present invention and consists of conservative merging and new rule finding. The details are described in the rules merging section.
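The rules merging section defines the actual procedure. Purely as an illustration of what recalculating the measures and adding new rules could look like, the sketch below assumes that each rule keeps raw transaction counts, so that support and confidence can be recomputed by pooling counts. The `RuleStats` class, the `merge_rules` function and the pooling strategy are assumptions for the example, not the disclosed merging method.

```python
from dataclasses import dataclass

@dataclass
class RuleStats:
    """Hypothetical per-rule counts from which the measures are derived."""
    n_total: int       # transactions seen
    n_antecedent: int  # transactions containing the antecedent
    n_both: int        # transactions containing antecedent and consequent

    @property
    def support(self):
        return self.n_both / self.n_total

    @property
    def confidence(self):
        return self.n_both / self.n_antecedent

def merge_rules(existing, incremental):
    """Merge incremental rules into existing ones (assumed strategy).

    - Rule in both sets: pool the counts, which recalculates the measures.
    - Rule only in the incremental set: add it as a new rule.
    - Rule only in the existing set: keep it unchanged.
    """
    merged = dict(existing)
    for key, inc in incremental.items():
        old = merged.get(key)
        if old is None:
            merged[key] = inc
        else:
            merged[key] = RuleStats(
                old.n_total + inc.n_total,
                old.n_antecedent + inc.n_antecedent,
                old.n_both + inc.n_both,
            )
    return merged

existing = {("A", "B"): RuleStats(100, 50, 40), ("E", "F"): RuleStats(100, 20, 15)}
incremental = {("A", "B"): RuleStats(100, 50, 20), ("C", "D"): RuleStats(100, 10, 8)}
merged = merge_rules(existing, incremental)
```

In this example the rule A → B ends up with pooled counts of 200, 100 and 60, the new rule C → D is added, and E → F is carried over untouched.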
Step 106 uses time sensitive information to adjust the association rules. This step is the second round of evolving in the present invention; it is an active evolving strategy and is described in the rule set adjustment section.
In step 107, a snapshot of the updated living-in-memory association rules is kept for further usage, as described in the snapshot section.
After step 107, the process goes back to step 102 to start a new time frame cycle. In this way, the association rules are refreshed and evolve as time goes by.
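The control flow of steps 101 to 107 can be summarised in a short skeleton. This is a sketch under stated assumptions: the function `evolve_rules` and its parameters `mine`, `merge` and `adjust` are hypothetical placeholders for the mining, merging and adjustment procedures, and the computation of time sensitive measures in step 103 is left out because it is defined in later sections.

```python
import copy

def evolve_rules(initial_rules, frames, mine, merge, adjust):
    """Skeleton of the evolving loop (all callables are placeholders).

    initial_rules: existing rules loaded into memory (step 101)
    frames:        iterable yielding the new data of each time frame
    mine:          new_data -> incremental rules            (step 104)
    merge:         (rules, incremental) -> rules            (step 105)
    adjust:        (rules, new_data) -> rules               (step 106)
    """
    rules = dict(initial_rules)                   # step 101
    snapshots = []
    for frame_data in frames:                     # step 102: next time frame
        new_data = list(frame_data)               # step 103: collect new data
        incremental = mine(new_data)              # step 104: new data only
        rules = merge(rules, incremental)         # step 105: first round
        rules = adjust(rules, new_data)           # step 106: second round
        snapshots.append(copy.deepcopy(rules))    # step 107: keep a snapshot
    return rules, snapshots

# Trivial stand-ins, used only to show how the pieces plug together.
final_rules, history = evolve_rules(
    initial_rules={"z": 1},
    frames=[["a"], ["b"]],
    mine=lambda data: {item: 1 for item in data},
    merge=lambda rules, inc: {**rules, **inc},
    adjust=lambda rules, data: rules,
)
```

Each pass through the loop touches only the data of the current time frame, and each snapshot is a deep copy so that later evolution does not alter earlier snapshots.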
Measures of association rules
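Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of literals called items, and let $D$ be the set of all transactions, where each transaction $T$ is a set of items such that $T \subseteq I$. Let $A$ and $B$ be sets of items such that $A \subseteq I$ and $B \subseteq I$. An association rule is an implication of the form $A \rightarrow B$, where $A \subset I$, $B \subset I$ and $A \cap B = \emptyset$.

Support

$$\mathrm{Support}(A \rightarrow B) = \frac{n(A \cup B)}{N}$$

where $n(A \cup B)$ is the number of transactions that contain both itemsets $A$ and $B$, and $N$ is the total number of transactions.

Confidence

$$\mathrm{Confidence}(A \rightarrow B) = \frac{n(A \cup B)}{n(A)}$$

where $n(A)$ is the number of transactions that contain the itemset $A$.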
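The standard support and confidence measures of a rule with antecedent A and consequent B can be computed directly from a list of transactions, as the following sketch shows. The helper names `count_containing`, `support` and `confidence` and the sample data are assumptions made for the example; the code covers only these two prior-art measures, not the time sensitive ones.

```python
def count_containing(transactions, items):
    """Number of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= set(t))

def support(transactions, a, b):
    """Fraction of all transactions that contain both itemsets a and b."""
    n = len(transactions)
    if n == 0:
        return 0.0
    return count_containing(transactions, set(a) | set(b)) / n

def confidence(transactions, a, b):
    """Among transactions that contain a, the fraction that also contain b."""
    n_a = count_containing(transactions, a)
    if n_a == 0:
        return 0.0
    return count_containing(transactions, set(a) | set(b)) / n_a

# Illustrative data: four transactions.
D = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
s = support(D, {"bread"}, {"milk"})     # 2 of 4 transactions contain both
c = confidence(D, {"bread"}, {"milk"})  # 2 of the 3 bread transactions contain milk
```

Both arguments are itemsets (any iterable of items), so multi-item antecedents and consequents are handled the same way.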
Support and confidence are measures described in the prior art. $Tsupport_i$ and $Tconfidence_i$ are new time sensitive measures define...