Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

Simultaneous Splitting And Garbage Collection of Indexed Storage Areas

IP.com Disclosure Number: IPCOM000101247D
Original Publication Date: 1990-Jul-01
Included in the Prior Art Database: 2005-Mar-16
Document File: 2 page(s) / 91K

Publishing Venue

IBM

Related People

Lyle, RW: AUTHOR

Abstract

A program is disclosed for simultaneously splitting and garbage collecting storage areas (i.e., pages) with a collated index on varying- or fixed-length entries. Whereas traditional methods for garbage collecting would require successive scans on the storage area's index (O(n*n) compares), or sorting of the storage area's index (O(n*log2(n)) compares), this algorithm takes advantage of a new page available at split time to reduce the number of comparisons in the storage area's index to O(n).

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

Simultaneous Splitting And Garbage Collection of Indexed Storage Areas

       A program is disclosed for simultaneously splitting and
garbage collecting storage areas (i.e., pages) with a collated index
on varying- or fixed-length entries.  Whereas traditional methods for
garbage collecting would require successive scans on the storage
area's index (O(n*n) compares), or sorting of the storage area's
index (O(n*log2(n)) compares), this algorithm takes advantage of a
new page available at split time to reduce the number of comparisons
in the storage area's index to O(n).

      In order for the algorithm to work, the entry format on a page
must include the length of the entry (or the length of any variable
length portions of the entry).  Further, if any holes (unused space
between entries) exist before the split, they must be formatted as
entries and contain their length, just like any other entry (but
there will be no offset in the entry map for these holes).  Finally,
the splitting page and the new page must be of equal size.

      The basic function of the algorithm is best described by its
pseudocode.  In the pseudocode, ENTRY_MAP() is the array of offsets
to entries in the page, ordered by the entry values, and SPLIT_POINT
is the number of entries that must be moved to the next page (this
variable may be set a number of different ways, depending on what
kind of split is being done).

      The pseudocode for the algorithm is as follows:
1.   Get new page in a buffer (the page contains binary zeroes)
2.   For I = 1 to number of keys on old page do
3.     Store the integer value I at offset ENTRY_MAP(I) on the new
page
4.   End
5.   ENT_OFS = offset to the first entry stored on the original page
(the entry at the lowest offset in the page, not necessarily the
lowest value).
6.   For I = 1 to number of keys on old page do
7.     ENT_IND = the offset into the original entry map for the I'th
entry on this page, stored at offset ENT_OFS in the new page.
8.     If the entry at offset ENT_OFS in the old page should be moved
to the new page then do
9.       Move this entry to the lowest unused space in the new page
10.      Set ENTRY_MAP(ENT_IND - (     of entries in original page -
SPLIT_POINT)) on the new page to the offset in the new page where
this entry was moved to.
11.    End
12.    Else...