Browse Prior Art Database

Large Table Single-Reference Compression Algorithm

IP.com Disclosure Number: IPCOM000036680D
Original Publication Date: 1989-Oct-01
Included in the Prior Art Database: 2005-Jan-29
Document File: 2 page(s) / 14K

Publishing Venue

IBM

Related People

Cooper, ML: AUTHOR [+4]

Abstract

Disclosed is a technique for maintaining a large in-storage mapping table form which one and only one reference to each input value is expected (so that after the single reference to a table entry occurs, the entry is no longer required and can be discarded), while: using as little storage as possible initially, without requiring large amount of contiguous storage, compressing the table efficiently, while also avoiding storage fragmentation, and maintaining a uniform look-up time for any value mapped by the table.

This text was extracted from a PDF file.
This is the abbreviated version, containing approximately 56% of the total text.

Page 1 of 2

Large Table Single-Reference Compression Algorithm

Disclosed is a technique for maintaining a large in-storage mapping table form which one and only one reference to each input value is expected (so that after the single reference to a table entry occurs, the entry is no longer required and can be discarded), while: using as little storage as possible initially,

without requiring large amount of contiguous

storage,

compressing the table efficiently, while also

avoiding storage fragmentation, and

maintaining a uniform look-up time for any value

mapped by the table.

The solution to this problem consists of a data structure and a set of algorithms for manipulating it, as described herein.

DESCRIPTION OF DATA STRUCTURE: In the data structure:

The HEADER contains control information regarding the

ROWCOUNT/ POWPTR ARRAY and the ROWS. In particular, it

contains:

ENTROW, the number of entries to be contained in a

ROW. This value can be specified by the

application using the data structure, based on the

trade-off described below.

VALIDROWS, the number of ROWs which remain

currently valid in the table.

INITROWS, the number of ROWs which were present

when the table was initialized (also equal to the

number of entries in the following ROWCOUNT/ROWPTR

ARRAY).

The ROWCOUNT/ROWPTR ARRAY contains, for each row in the

look-up table:

ROWCOUNT, a count of the number of valid entries

in the ROW.

ROWPTR, the address of the ROW, if it exists; zero

otherwise.

The HEADER and ROWCOUNT/ROWPTR ARRAY remain in storage from the time the table is built until the last table entry is referenced and the table is no longer required. The ROWS are freed one at a time as they are no longer required. This process is discussed further below. Because the table rows need not be contiguous with other rows or with the HEADER or ROWCOUNT/ROWPTR ARRAY, the requirement for large amounts of contiguous storage is eliminated.

It is possible to use the input value to index into the ROWS as follows: The row number is equal to the input value divided by ENTROW (the number of entries per row); the column number is the remainder of this division (i.e., the input value modulo ENTROW). For example, if there are 100 entries per row, the

1

Page 2 of 2

output value for input value 6157 can be found in the entry at row 61 (=6157/100), column 57 (=6157// 100).

If the value in the ROWCOUNT value is decreased by one each time a value in the corresponding row is referenced, then one can know when a row is no longer needed (the ROWCOUNT value will go to zero). At this time, storage for the entire row can be freed. Since a row may contain any number of entries, storage fragmentation can be grea...