Browse Prior Art Database

Method to Minimize Buffer Requirements in a Sort/Merge Task

IP.com Disclosure Number: IPCOM000039108D
Original Publication Date: 1987-Apr-01
Included in the Prior Art Database: 2005-Feb-01
Document File: 3 page(s) / 57K

Publishing Venue

IBM

Related People

Arnold, HH: AUTHOR [+3]

Abstract

A method is described which allows allocation of the minimum number of buffers (less than the total available) that can be used without causing an additional merge pass through the data file. The program using this method provides a capability to build lists of field-oriented data, retrieve them, and to sort them on compound data-type keys. Functions are also provided, while reading list data, to allow sequences of records to be accessed repetitively by marking the beginning and end point of the sequence and allowing the user to reposition the read cursor to the beginning of the marked sequence. As records are inserted into the (sorted) list, they are placed in a buffer or buffers. In addition, a record descriptor is inserted in a (Image Omitted) balanced binary sort tree (Fig. 1).

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 53% of the total text.

Page 1 of 3

Method to Minimize Buffer Requirements in a Sort/Merge Task

A method is described which allows allocation of the minimum number of buffers (less than the total available) that can be used without causing an additional merge pass through the data file. The program using this method provides a capability to build lists of field-oriented data, retrieve them, and to sort them on compound data-type keys. Functions are also provided, while reading list data, to allow sequences of records to be accessed repetitively by marking the beginning and end point of the sequence and allowing the user to reposition the read cursor to the beginning of the marked sequence. As records are inserted into the (sorted) list, they are placed in a buffer or buffers. In addition, a record descriptor is inserted in a

(Image Omitted)

balanced binary sort tree (Fig. 1). When the buffer or buffers are filled, the records in them are ordered via use of the sort tree, moved into an output buffer, and written to DASD (Direct Access Storage Device). We will call this set of ordered records a sequence string. After all records are processed, the list is closed. At this time, the DASD file contains multiple sequence strings comprising the list of records to be ordered. To accomplish this, a merge of the sequence strings is required (possibly in multiple merge passes). A merge pass is accomplished by merging sequence strings from the input file through available buffers (one buffer per string) into fewer larger sequence strings on an output file (Fig. 2). This process is continued, through multiple merge passes if necessary, until only one sequence string, comprising the total ordered file, is produced. The key performance consideration in designing this facility is, first, to minimize the number of merge passes (passes through the list file) and, second, to minimize buffer usage. At the beginning of the merge, the number of available buffers is known, as is the number of sequence strings in the list file. The objective of buffer allocation is to use the least number of buffers (less than or equal to the total number of available buffers) that will not force a greater number of merge passes than would be necessary if the total number of available buffers were used. This optimum number of buffers can be determined by taking successive roots of the number of sequence strings until the root of the number of sequence strings is less than or equal to the total n...