Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

High-Speed Pattern Matching Method

IP.com Disclosure Number: IPCOM000120122D
Original Publication Date: 1991-Mar-01
Included in the Prior Art Database: 2005-Apr-02
Document File: 3 page(s) / 86K

Publishing Venue

IBM

Related People

Mima, Y: AUTHOR [+2]

Abstract

String search is a key functions in data processing. When the target of the search is a set of fixed keywords, it is useful to make an index table of the keywords. On the other hand, such a table is not effective when the target is an arbitrary text string. We propose a method of accelerating the search for an arbitrary string by adding a preprocessing step for text pattern matching.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 60% of the total text.

High-Speed Pattern Matching Method

      String search is a key functions in data processing.
When the target of the search is a set of fixed keywords, it is
useful to make an index table of the keywords.  On the other hand,
such a table is not effective when the target is an arbitrary text
string.  We propose a method of accelerating the search for an
arbitrary string by adding a preprocessing step for text pattern
matching.

      If the volume of the data being searched prevents them from
being placed in the main storage, they must be kept in a slower
secondary storage device.  In this case, simply browsing through the
data takes a long time.  Without indexing, a string search operation
is very time-consuming, especially on a low-speed device.  We attempt
to reduce the total processing time by decreasing unnecessary
accesses to secondary storage.

      Before storing data, we divide the data volume into small
fragments.  Each fragment has corresponding information on the
characters contained (see Fig. 1).

      The format of the additional information for the segment is a
set of flag bits.  The whole character set is grouped into a few
categories such that the number of categories matches the number of
flag bits.  Each flag shows whether a given fragment contains at
least one of the characters in a corresponding category.  The
additional information is generated when the data is stored in the
fragment.

      In a search for an arbitrary st...