A Practical Algorithm for Frequent Substring Pattern Mining with Ternary Partitioning
Original Publication Date: 2003-Sep-12
Included in the Prior Art Database: 2003-Sep-12
Disclosed is a novel algorithm that enumerates all substrings that occur in a given string more frequently than a specified threshold. In addition, the algorithm generates a data structure that is useful for browsing the frequent substrings and their contexts in the original text. The algorithm is based on a divide-and-conquer approach: it decomposes the mining task into a set of smaller tasks by means of a ternary partitioning technique.
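The disclosure does not spell out the partitioning step, so the following Python sketch is one plausible reading and not the disclosed implementation. It assumes that ternary partitioning works as in multikey quicksort: the suffix start positions are split into three groups according to whether the character at the current depth is less than, equal to, or greater than a pivot character. Each group is a smaller mining task, and any group with fewer positions than the threshold is discarded. The names `frequent_substrings`, `mine`, and `min_count` are hypothetical, occurrences are counted with overlaps, and the threshold is treated as "at least `min_count`". The browsing data structure mentioned above is not reproduced here.

```python
def frequent_substrings(text, min_count):
    """Map each substring occurring at least min_count times to its count.

    Illustrative sketch only: assumes multikey-quicksort-style ternary
    partitioning of suffix start positions; overlapping occurrences count.
    """
    if min_count < 1:
        raise ValueError("min_count must be at least 1")
    result = {}

    def mine(positions, depth):
        # All suffixes in `positions` agree on their first `depth` characters.
        # Drop suffixes that have no character left at this depth.
        positions = [p for p in positions if p + depth < len(text)]
        if len(positions) < min_count:
            return  # nothing in this subtask can reach the threshold

        # Ternary partition around a pivot character at the current depth.
        pivot = text[positions[len(positions) // 2] + depth]
        less = [p for p in positions if text[p + depth] < pivot]
        equal = [p for p in positions if text[p + depth] == pivot]
        greater = [p for p in positions if text[p + depth] > pivot]

        mine(less, depth)  # same depth, smaller characters
        if len(equal) >= min_count:
            start = equal[0]
            # The shared prefix extended by the pivot character is frequent.
            result[text[start:start + depth + 1]] = len(equal)
            mine(equal, depth + 1)  # extend the pattern by one character
        mine(greater, depth)  # same depth, larger characters

    mine(list(range(len(text))), 0)
    return result


if __name__ == "__main__":
    for pattern, count in sorted(frequent_substrings("abracadabra", 2).items()):
        print(pattern, count)
```

Because the "less" and "greater" groups are mined at the same depth while only the "equal" group advances to the next character, every recursive call works on a strictly smaller set of positions or a longer shared prefix. This is how the sketch realizes the divide-and-conquer decomposition into smaller tasks.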