Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Encoding Data into Irrational Magic Numbers for Fast Searching and Comparing

IP.com Disclosure Number: IPCOM000106522D
Original Publication Date: 1993-Nov-01
Included in the Prior Art Database: 2005-Mar-21
Document File: 4 page(s) / 135K

Publishing Venue

IBM

Related People

Georgiou, CJ: AUTHOR [+3]

Abstract

Disclosed is a simple encoding method yielding a unique representation of data objects for fast and efficient processing of lookup, modification, insertion, and deletion operations. This encoding can also be used to capture hierarchical relations between data objects (e.g., tree structures, or DAGs - Directed Acyclic Graphs).

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 41% of the total text.

Encoding Data into Irrational Magic Numbers for Fast Searching and Comparing

      Disclosed is a simple encoding method yielding a unique
representation of data objects for fast and efficient processing of
lookup, modification, insertion, and deletion operations.  This
encoding can also be used to capture hierarchical relations between
data objects (e.g., tree structures, or DAGs - Directed Acyclic
Graphs).

      The disclosed encoding method can be used for large data-bases,
where simple (i.e., non-relational) information searches are
performed most of the time.  It is also useful for applications in
which there is a need to reference a specific data item, among many
similar items.  The encoding can yield a unique object "signature,"
or magic number.  Traditional data base operations can be done on the
magic numbers of the respective data items instead of the information
itself.  For large data records, this can be faster because of less
main memory space needed.

      The objects to be encoded are assumed to be text strings
defined by an alphabet S, in which each letter s sub i in S is a
positive integer (e.g., ASCII code).  Hierarchical (tree-like)
structures of various other data objects can also be encoded, as
shown later.  The disclosed encoding method has three parts: 1) A
pre-computed, fixed constant coefficients table; 2) the evaluation of
a polynomial in specific points; and 3) the approximate
representation of irrational numbers.  These parts are explained
below.

      The coefficient table T has a maximum of N entries, where N is
some upper bound on the object length (i.e., maximum number of
characters in a string).  Each table entry, alpha sub i in T, is the
value of the natural logarithm of a unique prime number (i.e., no two
different entries in the table are equal) divided by the integer part
of the resulting natural logarithm (called the normalization
procedure).

      The evaluation of the polynomial is done as follows: Denoted by
s sub i is the i-th character in a text string of length n, where n
lesym N.  The polynomial yielding a unique representation of the
object is called a magic number MN, and has the form:
alpha sub 1 * s sub 1 + alpha sub 2 * s sub 2 + ellip + alpha sub n *
s sub n

      The evaluation of this polynomial for specific values of S sub
i yields a unique irrational number, i.e., there is a one-to-one
correspondence between a specific object (string) and the magic
number it represents.  This is based on the Schanuel Conjecture [1]
regarding algebraical independence between numbers with certain
properties.

      Irrational numbers require an infinite memory space to display
their value.  Therefore, they must be represented in a computing
system using an approximation.  If a floating point representation is
used, the degree of approximation needed (i.e., the number of bits
allocated for the mantissa and exponent) depends on various factors,
such as the total...