Dismiss
InnovationQ will be updated on Sunday, Oct. 22, from 10am ET - noon. You may experience brief service interruptions during that time.
Browse Prior Art Database

External Sorting Data Movement And Cache Improvement Method

IP.com Disclosure Number: IPCOM000101512D
Original Publication Date: 1990-Aug-01
Included in the Prior Art Database: 2005-Mar-16
Document File: 2 page(s) / 73K

Publishing Venue

IBM

Related People

Case, DR: AUTHOR [+2]

Abstract

For many external sorting algorithms, portions of the file are first sorted to create strings, followed by an external merging of strings to form the final sorted file. Often in the sorting phase, data is read from the secondary storage into input buffers in main storage. Each record is then moved to a work area and sorted in the work area. After the data is sorted, each record is moved from the work area to output buffers. The data is then written out to the secondary storage and creates a string.

This text was extracted from an ASCII text file.
This is the abbreviated version, containing approximately 52% of the total text.

External Sorting Data Movement And Cache Improvement Method

       For many external sorting algorithms, portions of the
file are first sorted to create strings, followed by an external
merging of strings to form the final sorted file.  Often in the
sorting phase, data is read from the secondary storage into input
buffers in main storage.  Each record is then moved to a work area
and sorted in the work area.  After the data is sorted, each record
is moved from the work area to output buffers.  The data is then
written out to the secondary storage and creates a string.

      A LRU-managed processor cache is assumed present, and is
assumed to be quite small relative to the size of main storage.

      In order to utilize the processor cache efficiently, portions
of the data in the input buffers are sorted to create a sorted
sublist.  After the data in the input buffers is exhausted, we merge
all the sorted sublists together to form a long sorted string.  We
also constrain ourselves to create a small enough number of sorted
sublists in the first sorting step to avoid the amount of processor
cache misses.

      With this initial sorting algorithm, each record is moved once
and the key of each record is moved twice in main storage during the
initial sorting phase.

      Data is first read from the secondary storage into the input
buffers.  The keys of a subset of records in the input buffers are
then moved to a work area, and arranged in a linked list with a
pointer to the original record in the input buffer.  The keys in the
work area are then sorted in order and moved to another work area to
form a sorted sublist containing the key and pointer back to the
original record.  This process continues until all the records in the
input created in the first sorting step are merged to form a linear
li...