Browse Prior Art Database

Fast, Restricted Search Algorithm

IP.com Disclosure Number: IPCOM000112292D
Original Publication Date: 1994-Apr-01
Included in the Prior Art Database: 2005-Mar-27
Document File: 2 page(s) / 67K

Publishing Venue

IBM

Related People

Keist, RS: AUTHOR

Abstract

A search algorithm is described which allows for fast searching of a database.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Fast, Restricted Search Algorithm

      A search algorithm is described which allows for fast searching
of a database.

      This search algorithm was used in a situation where there was a
log file of 32-bit data items, and new data items needed to quickly
be processed to determine if they matched any items already in the
log file.  If there was a match, the new data item could be
discarded.  If there was not a match, the new data item needed to be
appended to the log file.

The search algorithm which was developed has several properties worth
mentioning:

1.  There is a setup time that is directly proportional to the number
    events in the log file;

2.  Following the setup time, the time necessary to process the new
    events is directly proportional only to the number of new events;
    and

3.  There are restrictions on the size of the data items which can be
    processed with this algorithm.

      The event space includes all the possible permutations of the
events being processed.  For example, if 20 bits are used to encode
the events, the size of the event space would be 2**20 bits.  The
restriction on the size of the event space occurs due to the
implementation of the search algorithm, as the implementation
involves using an array of bits of size equal to the event space.

The algorithm is implemented as follows.

1.  An array of unsigned integers is created whose size is equal to
    the size of the event space.  The array initially contains all
    zeros.  The events are treated as addresses into the array.  The
    bit addressed by an event indicates whether that event has been
    processed.

2.  The events in the log file are processed first.  The bits in the
    array corresponding to each event in the log file are turned on
    to indicate that those events have been hit.

3.  The new data items are processed next.  The bit for each
    unprocessed event is tested.  If the bit is already set, the
    event is not new and it can be discarded.  If the bit is zero,
    set the bit to one and append the event to the log file.

      The main attraction of the algorithm is the speed with which it
can process new events.  The...