Browse Prior Art Database

# Algorithm to Search and Represent Differences Between Objects

IP.com Disclosure Number: IPCOM000107439D
Original Publication Date: 1992-Feb-01
Included in the Prior Art Database: 2005-Mar-21
Document File: 4 page(s) / 173K

IBM

## Related People

Minervini, M: AUTHOR

## Abstract

Disclosed is an algorithm for obtaining a schematic documentation of the differences between two objects both representable as an ordered collection of elements, any pair of which may be operand of the comparison logical operation. The algorithm is described with reference to two mathematical arrays ARA and ARB, shown in Fig. 1, whose elements A01-A18 and B01-B18, respectively, are alphabetical strings.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 51% of the total text.

Algorithm to Search and Represent Differences Between Objects

Disclosed is an algorithm for obtaining a schematic
documentation of the differences between two objects both
representable as an ordered collection of elements, any pair of which
may be operand of the comparison logical operation. The algorithm is
described with reference to two mathematical arrays ARA and ARB,
shown in Fig. 1, whose elements A01-A18 and B01-B18, respectively,
are alphabetical strings.

The arrays, which include some identical sequences of array
elements, may represent the real evolution of the same set of
information along a certain time interval. The basic idea of the
algorithm, whose four logical steps are presented hereinafter, is to
search the collection of the identical sequences of ARA and ARB
(sequence pair collection) that must be as follows:
* the selected sequences do not have overlapping elements on ARA or
ARB;
* each pair of correspondent selected sequences does not cross its
two sequences with sequences of other pairs of the collection (the
elements of each pair must be, on ARA and ARB, either all before or
all after with respect to the elements of any other pair);
* the number of ARA and ARB elements included on selected pairs must
be as big as possible.

In the first step, ARA is sequentially scanned from top to
bottom to search, for each ARA element and using the comparison
operation, the eventual first occurrence of an equal string element
on ARB. Seven pairs of sequences, called SP1...SP7, are obtained on
the array table ARSEQ: the data of each entry in the table are
FEA/LEA and FEB/LEB (first/last sequence element on ARA and the same
for ARB), TSE (total of sequence elements) and SPS (sequence pair
status), which may be 'S' or 'A' (selected or annulled).

In the second step, the algorithm removes, for the selected
sequences, any overlapping of elements which may occur on ARB and
their correspondent elements on ARA. These elements are removed from
the sequences of major index: in some cases this operation may annul
entire sequence pairs (see SP5 and SP6). The algorithm then removes
any crossing pairs, but in observance of the above mentioned
principle, fixes the number of elements of the selected sequence
pairs as big as possible. Fig. 2 shows the situation of the arrays
after the second step.

In the third step, the algorithm tests the possibility of
re-use of the elements of the annulled pairs to select new sequence
pairs.  Thus, A05-B06 and A07-B07 are re-used, obtaining the new
pairs SP2 and SP3 and, moreover, the pairs SP5...