Browse Prior Art Database

Effective sort algorithm for partially sorted data array

IP.com Disclosure Number: IPCOM000021457D
Original Publication Date: 2004-Jan-20
Included in the Prior Art Database: 2004-Jan-20
Document File: 4 page(s) / 37K

Publishing Venue

IBM

Abstract

This disclosure describes a effective sort algorithm for partially sorted data array with following characteristics: (1) Input data array consists of more than one sorted data blocks. (2) Each data structure consists of more than one sort key, and the data block is sorted by the 1st key, but not always sorted by the other keys. (3) Most data can be identified by the 1st key without duplication. In most application system, the standard sort algorithm, such as merge-sort, quick-sort, etc., could be considered for the solution. However, the time critical system, like Monte Carlo simulation, requires the high performance sort algorithm like the technique described in this article.

This text was extracted from a PDF file.
At least one non-text object (such as an image or picture) has been suppressed.
This is the abbreviated version, containing approximately 56% of the total text.

Page 1 of 4

Effective sort algorithm for partially sorted data array

Fig. 1 shows the example. In this case, the 1st sort key is DATE and the 2nd sort key is PRIORITY NUMBER. Data array consists of some data blocks which is already sorted by 1st key (DATE) but not sorted by 2nd key (PRIORITY NUMBER). The PRIORITY NUMBER is referred only when the DATE is duplicated. So, in most case, the PRIORITY NUMBER is set to '1' in this example. Sorting this type of data in general data processing is common. However, there is no effective well-known sort algorithm, like merge-sort, quick-sort, etc, which fits with this type of data.

Fig. 1 Example of data array

Fig. 2 shows visual representation for pre-sorted data array. The disclosed algorithm is as follows:

(1) Identifies the blocks those are already sorted by 1st key with linear scan.

( refer to Fig. 3 sort phase 1 - analyze )

(2) If found the data not sorted by the 2nd key in block during analyze phase 1, sort it partially by 2nd key.

( refer to Fig. 4 sort phase 1 - partial sort )

(3) Sort by the 1st key with merge sorting method.

( refer to 3. Fig. 5 sort phase 2 - merge sort )

Thus, a large number of partially sorted data array can be sorted effectively with this

1st key

2nd key

first block

second block

third block

(1) sorted by first key in each block

0101

1

0203

2

0203

1

(2) case of not sorted by second key

(3) The value of second key is 1 in most case, because of few first key duplication.

(2) case of not sorted by second key

(2) case of not sorted by second key

0527

1

1224

1

0115

1

0115

2

1231

2

1231

1

07...