Browse Prior Art Database

Arrangement to Speed Up Sort/Merge by Using Multiple Compares and Exchanges

IP.com Disclosure Number: IPCOM000077212D
Original Publication Date: 1972-Jun-01
Included in the Prior Art Database: 2005-Feb-25
Document File: 7 page(s) / 233K

Publishing Venue

IBM

Related People

Tsui, F: AUTHOR [+2]

Abstract

Closer consideration of the sort/merge operation indicates that there would be advantages in comparing and arranging more than two records at a time. This is true especially in view of the fact that, in processor units of modern design, many fast "local-store" registers are available for use, and high-speed buses are provided as data paths between them. Consider for example the case of comparing and arranging 3 records (in ascending order: A, B, C) as an elementary step in a long sort/ merge operation. This can be done either (a) by taking 2 records at a time and doing it in two phases, or (b) by taking 3 records at a time and getting it done in a single pass. Fig. 1 shows a comparison between these two approaches. Ten cases of possible initial situations are considered.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 28% of the total text.

Page 1 of 7

Arrangement to Speed Up Sort/Merge by Using Multiple Compares and Exchanges

Closer consideration of the sort/merge operation indicates that there would be advantages in comparing and arranging more than two records at a time. This is true especially in view of the fact that, in processor units of modern design, many fast "local-store" registers are available for use, and high-speed buses are provided as data paths between them. Consider for example the case of comparing and arranging 3 records (in ascending order: A, B, C) as an elementary step in a long sort/ merge operation. This can be done either (a) by taking 2 records at a time and doing it in two phases, or (b) by taking 3 records at a time and getting it done in a single pass. Fig. 1 shows a comparison between these two approaches.

Ten cases of possible initial situations are considered. (Actually, there are more than ten, but all the cases which do not need exchanges - such as AAA, AAC, ACC - may be regarded to be practically the same as case 1 - ABC - ). It can be seen that the approach (b) would have the following advantages:
(1) In (b), record 2 needs to be read out only once, whereas in (a) it must be read out twice, once during each pass.
(2) In cases 1 to 3, and 7 to 9, same numbers of exchanges are made in (a) and (b) and the same results are obtained. In cases 6 and 10, one exchange more is made in (a). But a more serious disadvantage in (a) is that in cases 4. 6, and 9 the desired sequence (ABC) is not obtained and it must take a second run to get it finally sorted out.

In the case of a large number of records being sorted, point (1) means that approach (a), reading out all records twice except the first and the last, involves 33% more main-memory accesses than approach (b) which only has to read out half of the records twice, and point (2) implies that, under certain conditions, approach (a) requires many more runs. In general, the number of readouts of records in sorting R records by comparing and arranging two records each time will be 2(R-1), and that by taking r records at a time is approximately r (R-1)/r-1, so that the ratio between them is about 2 (r-1)/r. If r = 5, for instance, the ratio is
1.6, indicating that approach (a) involves about 60 % more memory accesses.

The cases with r = 3 and r = 5 are of special interest because, in sorting, with the records to be taken in groups (of r records each) overlapping each other by 1 record and, therefore, spaced at separations of (r-1) records, the calculation of record addresses can be much simplified by having (r-1) = a power of 2.

In the following, the arrangement will be described using the case with r 3 as a concrete example. The principle is equally well applicable to cases with r having other values (5, 9, etc.). The choice for the value of r depends on the cost/performance considerations.

The basic idea of the arrangement is to create a facility for: comparing arbitrarily specifiable portions of more th...