A Practical Algorithm for Frequent Substring Pattern Mining with Ternary Partitioning

IP.com Disclosure Number: IPCOM000019354D
Original Publication Date: 2003-Sep-12
Included in the Prior Art Database: 2003-Sep-12

Publishing Venue

IBM

Abstract

Disclosed is a novel algorithm that enumerates all substrings of a given string that occur more frequently than a specified threshold. In addition, the algorithm generates a data structure that is useful for browsing frequent substrings and their contexts in the original text. The algorithm takes a divide-and-conquer approach: it uses ternary partitioning to decompose the mining task into a set of smaller tasks.
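
The abstract does not give the details of the disclosed algorithm, so the following is only a minimal sketch of the general idea, written in Python under stated assumptions. It treats every suffix of the text as a candidate, splits the suffixes three ways on the character at the current depth (less than, equal to, greater than a pivot), and descends one character deeper only in the "equal" group. The function name `frequent_substrings`, the parameter `min_count`, the choice of pivot, and the "at least `min_count` occurrences" reading of the threshold are all assumptions made for illustration. Overlapping occurrences are counted, and the browsing data structure mentioned in the abstract is not built here.

```python
def frequent_substrings(text, min_count):
    """Return {substring: count} for substrings occurring at least min_count times.

    Illustrative sketch only: a ternary (less / equal / greater) partition of
    suffix start positions, in the style of multikey quicksort. Not the
    disclosed implementation.
    """
    result = {}

    def mine(positions, depth):
        # Every suffix in `positions` shares the same first `depth` characters.
        # Drop suffixes that are too short to be extended by one more character.
        live = [p for p in positions if p + depth < len(text)]
        if len(live) < min_count:
            return  # no extension of this prefix can be frequent

        # Assumed pivot choice: the character seen by the middle suffix.
        pivot = text[live[len(live) // 2] + depth]
        less = [p for p in live if text[p + depth] < pivot]
        equal = [p for p in live if text[p + depth] == pivot]
        greater = [p for p in live if text[p + depth] > pivot]

        # Smaller subtasks at the same depth.
        mine(less, depth)
        if len(equal) >= min_count:
            start = equal[0]
            # The shared prefix, extended by the pivot character, is frequent.
            result[text[start:start + depth + 1]] = len(equal)
            mine(equal, depth + 1)
        mine(greater, depth)

    mine(list(range(len(text))), 0)
    return result


if __name__ == "__main__":
    for substring, count in sorted(frequent_substrings("abracadabra", 2).items()):
        print(substring, count)
```

Groups that fall below `min_count` are discarded at once, because no longer substring can occur more often than its own prefix. That pruning is what keeps the subtasks small in this sketch.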